AWS Database Migration Service: Migrate Databases to AWS With Minimal Downtime
Moving a production database to a new environment is one of the highest-risk activities in operations. A traditional migration approach — stop the application, export the data, import to the new system, redirect — might mean 4-12 hours of downtime for a large database. For a business that cannot afford that window, the export-and-import model is not viable.
AWS Database Migration Service (DMS) was designed around a different approach: replicate continuously from source to target while both are running, validate the data, and cut over at a time of your choosing with minutes of actual downtime instead of hours.
How DMS Works
DMS migrates data through three components:
[Source Database] [Replication Instance] [Target Database]Oracle, MySQL, <---- EC2 instance managed ----> RDS Aurora,SQL Server, by DMS that runs Redshift,PostgreSQL, the migration task DynamoDB,MongoDB, etc. S3, etc.
Source Endpoint: Task: Target Endpoint:connection string + full-load / CDC connection string +credentials / full-load-and-CDC credentialsReplication instance: A managed EC2 instance that DMS provisions and manages. It reads data from the source endpoint and writes to the target endpoint. You choose the instance class based on the data volume and replication rate.
Endpoints: Connection configurations for the source and target. They store the database host, port, credentials, and any engine-specific settings. Credentials are stored in AWS Secrets Manager, not in the endpoint itself.
Tasks: The migration job that runs on the replication instance. A task specifies which tables to migrate, the migration type, and any transformation rules.
Task Types: Three Migration Modes
Full Load: Exports all selected tables from the source and loads them into the target. The source database continues operating normally during the load. The target is not in sync until the load completes. Use this when you have a maintenance window long enough to copy all the data, or for initial population of a target before enabling CDC.
Change Data Capture (CDC) only: Reads the change stream of the source database (transaction logs) and replays only the changes on the target. The target must already be seeded with the data. Use this when you have an existing replica or a pre-loaded target and only need to keep it in sync.
Full Load and CDC: The most common production migration pattern. DMS first runs a full load of the selected tables. Once the full load finishes, DMS automatically switches to CDC mode and begins replicating all changes that occurred since the load started. The target stays in sync with the source indefinitely. When you are ready to cut over, stop new writes to the source, wait for DMS to apply the last few seconds of changes, redirect applications to the target.
Full Load + CDC Migration Timeline:
Day 1 Day 3 Day 5 | | | v v v[Start full load] [Full load done] [Application cutover] | | | +----full load---+------CDC-------+ Source writes Both dbs in sync Cut over (minutes of downtime) continueHomogeneous vs Heterogeneous Migrations
Homogeneous: Same database engine on source and target (MySQL → MySQL, Oracle → Oracle, PostgreSQL → PostgreSQL). DMS handles these directly — no schema conversion is needed because both engines understand the same SQL dialect and data types.
Heterogeneous: Different database engines (Oracle → PostgreSQL, SQL Server → Aurora, MySQL → DynamoDB). These require schema conversion because SQL dialects, data types, stored procedures, and constraints differ between engines. DMS handles the data movement; the AWS Schema Conversion Tool (SCT) handles the schema conversion.
Homogeneous Migration (MySQL → Aurora MySQL): SCT not needed DMS handles both schema and data Risk: Lower Typical use case: Lift and shift to managed RDS
Heterogeneous Migration (Oracle → PostgreSQL): Step 1: SCT converts Oracle DDL, views, procedures to PostgreSQL equivalents Step 2: DMS migrates the data Step 3: Manual review and testing of converted objects Risk: Higher (incompatible features need manual remediation) Typical use case: Re-platform to open source or cloud-native engineAWS Schema Conversion Tool (SCT)
SCT is a desktop application that analyses the source database schema and generates the equivalent schema for the target engine. It also assesses how much of the migration can be automated.
SCT produces an assessment report showing:
- Percentage of schema that converts automatically
- Objects that require manual intervention
- Complexity rating for each unconverted object
Commonly problematic objects:
- Oracle PL/SQL packages and procedures: PostgreSQL has PL/pgSQL but the syntax differs significantly
- SQL Server T-SQL specific functions: GETDATE() vs NOW(), TOP vs LIMIT
- Oracle-specific data types: NUMBER(p,s) maps to NUMERIC in PostgreSQL, CHAR semantics differ
- Database triggers with complex logic: Often need manual rewrite for target engine syntax
SCT does not fix code — it flags issues and provides suggestions. The conversion work for complex stored procedures is typically manual and is the biggest labour component of a heterogeneous migration.
Change Data Capture: How DMS Reads Changes
CDC captures changes by reading the database’s transaction log. Different source engines expose this differently:
| Engine | CDC Mechanism |
|---|---|
| Oracle | LogMiner or Oracle Binary Reader |
| SQL Server | SQL Server Agent, MS-CDC, or backup/restore |
| PostgreSQL | Logical replication (pglogical or pg_logical_replication_slot) |
| MySQL / Aurora MySQL | Binary log (binlog) |
| MongoDB | Oplog |
For CDC to work, the source database must have its transaction log enabled and retained long enough for DMS to read it. For MySQL, this means binlog_format = ROW. For PostgreSQL, wal_level = logical. These settings typically require a database restart to enable, which must be planned in advance.
Replication lag is the time between a change being committed to the source and that change being applied to the target. For most workloads, DMS keeps lag under a few seconds. For very high write rates, the replication instance may need a larger instance class.
Table Mapping and Transformation Rules
DMS uses JSON-defined table mapping rules to specify which tables to migrate and how to transform them during migration.
Selection rules include or exclude specific schemas or tables:
{ "rules": [ { "rule-type": "selection", "rule-id": "1", "rule-name": "include-sales-schema", "object-locator": { "schema-name": "sales", "table-name": "%" }, "rule-action": "include" }, { "rule-type": "selection", "rule-id": "2", "rule-name": "exclude-temp-tables", "object-locator": { "schema-name": "sales", "table-name": "tmp_%" }, "rule-action": "exclude" } ]}Transformation rules rename schemas, tables, or columns:
{ "rule-type": "transformation", "rule-id": "3", "rule-name": "lowercase-schema", "rule-action": "convert-lowercase", "rule-target": "schema", "object-locator": { "schema-name": "SALES" }}Replication Instance Sizing
The replication instance processes all data flowing between source and target. Undersizing it creates a bottleneck:
Indicators of undersized replication instance: - Replication lag increasing over time - High CPU utilisation on the instance (>70% sustained) - Disk I/O saturation - Swap space usage
General sizing guidance: - Small databases (<100 GB, <1,000 transactions/sec): dms.t3.medium - Medium databases (100-500 GB, moderate traffic): dms.r5.xlarge - Large databases or high-throughput: dms.r5.4xlarge or larger - Add 50% headroom for peak loads
Multi-AZ replication instance: - Hot standby in a second AZ - Automatic failover if the primary fails - Use for production migrations where replication continuity is criticalReal-World Scenario: Oracle to Aurora PostgreSQL Migration
A financial services company runs a 2 TB Oracle database (on-premises) that processes 500 transactions/second during peak hours. The goal is to migrate to Aurora PostgreSQL in AWS with under 5 minutes of application downtime during cutover.
Phase 1 (Week 1-2): Schema conversion
- Run SCT against the Oracle schema
- 78% converts automatically
- 22% requires manual work: 12 stored procedures with Oracle-specific cursor syntax, 4 packages with DBMS_SCHEDULER jobs
- Engineers manually rewrite the unconverted objects for PostgreSQL
Phase 2 (Week 3): Infrastructure
- Create Aurora PostgreSQL cluster in the target AWS account
- Provision DMS replication instance (r5.2xlarge, Multi-AZ)
- Configure source endpoint (Oracle on-premises, via Direct Connect)
- Configure target endpoint (Aurora PostgreSQL)
- Apply converted DDL to Aurora PostgreSQL
Phase 3 (Week 4): Initial load
- Start DMS Full Load + CDC task
- Full load completes in 18 hours (2 TB at ~30 MB/sec)
- CDC begins automatically, replication lag stabilises at 2-3 seconds
Phase 4 (Week 5): Validation and cutover
- Run automated row count and checksum validation between source and target
- Test application against Aurora PostgreSQL in staging
- Schedule cutover during Sunday 02:00-06:00 maintenance window
- Stop application, wait for DMS lag to reach zero (about 90 seconds), switch database connection string, start application
- Actual downtime: 4 minutes
Interview Notes
Q: What is the difference between Full Load and CDC in DMS? Full Load exports all existing data from the source and loads it into the target. It is a one-time operation and does not replicate changes that occur during the load. CDC replicates ongoing changes from the source transaction log to the target. Full Load + CDC combines both: initial data copy followed by continuous change replication.
Q: What does the Schema Conversion Tool (SCT) do? SCT converts database schema (DDL) from one database engine format to another. It analyses the source schema, converts what it can automatically, and flags objects that require manual remediation. DMS does not convert schemas — it only moves data. SCT is needed for heterogeneous migrations.
Q: What happens if the replication instance fails during a migration? With a Multi-AZ replication instance, a standby automatically takes over within 1-2 minutes. The task resumes from the last committed CDC position. With a single-AZ instance, the instance must be manually restarted or replaced, and the task resumes from the last saved state (which may require replaying some changes).
Q: Can DMS migrate to non-AWS targets? Yes, DMS supports on-premises databases and EC2-hosted databases as targets. The replication instance connects to any reachable endpoint. However, DMS is primarily designed for migrations to or within AWS, and cross-cloud migrations (AWS to GCP, for example) are not supported as native target endpoints.