Amazon Aurora: MySQL/PostgreSQL-Compatible Database With Cloud-Native Architecture
Amazon Aurora started from a specific observation: the bottleneck in most cloud databases is not compute or memory — it is the constant round-trips between the database engine and the storage layer. Aurora was redesigned from the ground up to separate storage from compute and push as much intelligence as possible into the distributed storage layer itself.
The result is a relational database that is wire-compatible with MySQL and PostgreSQL (your existing drivers work unchanged) but built on fundamentally different infrastructure. AWS claims Aurora delivers up to five times the throughput of standard MySQL and three times that of PostgreSQL on the same hardware — gains that come from the storage architecture, not from faster CPUs.
The Storage Layer: Why Aurora Is Different
Standard RDS replicates the storage layer once, to a standby in another AZ. Aurora does something different: it writes to a distributed, fault-tolerant storage cluster that spans three Availability Zones, with two copies of data in each AZ — six copies total.
    Aurora Storage Architecture
    ============================
    Writer Instance (compute)
          │
          │  (writes go to storage cluster, not to a replica instance)
          ▼
    ┌──────────────────────────────────────────────────────────┐
    │  Aurora Distributed Storage (automatically managed)      │
    │                                                          │
    │  AZ-1a        AZ-1b        AZ-1c                         │
    │  [Copy 1] ─── [Copy 3] ─── [Copy 5]                      │
    │  [Copy 2] ─── [Copy 4] ─── [Copy 6]                      │
    │                                                          │
    │  Quorum writes: 4 of 6 must acknowledge                  │
    │  Quorum reads:  3 of 6 suffice                           │
    │                                                          │
    │  Auto-grows in 10 GB increments, up to 128 TB            │
    └──────────────────────────────────────────────────────────┘
    Key: Reader instances also access the same storage cluster
         No data copying between writer and readers
         Failover is fast because readers already have the data
This quorum approach means Aurora keeps accepting writes after losing any two copies, and keeps serving reads after losing three, for example an entire AZ plus one more copy. Losing two full AZs would leave only two copies, which is below both quorums. The engine sends each write across the network once, directly to the storage nodes, rather than writing to its own disk and then shipping the change to a replica instance. That single-hop write path is a significant source of Aurora’s performance advantage.
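The quorum arithmetic can be checked with a few lines of Python. This is a minimal sketch of the counting argument only, using the 6/4/3 figures from the diagram; the class and method names are made up for illustration and do not correspond to any Aurora API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    copies: int = 6        # total copies of the data (two per AZ, three AZs)
    write_quorum: int = 4  # acknowledgements needed before a write counts
    read_quorum: int = 3   # copies needed to serve a quorum read

    def is_consistent(self) -> bool:
        # A read set must overlap every write set, and any two write sets
        # must overlap each other, so a reader always sees the latest write.
        return (self.read_quorum + self.write_quorum > self.copies
                and 2 * self.write_quorum > self.copies)

    def can_write(self, failed_copies: int) -> bool:
        return self.copies - failed_copies >= self.write_quorum

    def can_read(self, failed_copies: int) -> bool:
        return self.copies - failed_copies >= self.read_quorum


cfg = QuorumConfig()
print("consistent:", cfg.is_consistent())
for failed in range(cfg.copies + 1):
    print(f"{failed} copies lost -> write={cfg.can_write(failed)} read={cfg.can_read(failed)}")
```

Running it shows writes surviving up to two lost copies and reads surviving up to three, which matches the case of one full AZ (two copies) plus one additional copy.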
Cluster Architecture
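An Aurora cluster consists of one writer instance and up to 15 reader instances, all sharing the same storage cluster. Readers can serve SELECT queries without first receiving their own copy of the data, because they read from the same storage volume as the writer. Some replica lag remains, since each reader has to refresh the pages it has cached in memory, but it is typically well under 100 milliseconds rather than the seconds or minutes that log-shipping replication can accumulate.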
When the writer fails, Aurora promotes one of the existing readers. Because the reader already has the current state of the storage layer (shared storage), promotion is fast — typically 30 seconds or less, compared to 60-120 seconds for standard RDS Multi-AZ.
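Because a failover still interrupts connections for some seconds, client code usually retries rather than failing on the first error. The sketch below is one possible approach, not an AWS-provided helper: `call_with_retry` and its parameters are hypothetical, and `ConnectionError` stands in for whatever exception your database driver actually raises when the writer is unreachable.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 8,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying with exponential backoff on ConnectionError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise AssertionError("unreachable")


# Simulated writer that is unavailable for the first three calls.
calls = {"n": 0}


def flaky_query() -> str:
    calls["n"] += 1
    if calls["n"] <= 3:
        raise ConnectionError("writer unavailable during failover")
    return "ok"


slept = []  # record the delays instead of actually sleeping
result = call_with_retry(flaky_query, sleep=slept.append)
print(result, slept)
```

With the default settings the total wait before giving up is about 31.5 seconds, chosen here so that it covers the typical failover window mentioned above. Each retry should open a fresh connection so that the writer endpoint is resolved again.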
Applications connect to two cluster endpoints:
- Writer endpoint (cluster endpoint): always resolves to the current writer instance
- Reader endpoint: load-balances connections across all available reader instances
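One common way to use the two endpoints is to route statements in the application. The following is a deliberately naive sketch under stated assumptions: the `ClusterEndpoints` class is hypothetical, the hostnames are placeholders in the endpoint naming pattern, and the table and column names are invented. Real applications often make this decision per request or per transaction rather than by inspecting SQL text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterEndpoints:
    writer: str  # cluster (writer) endpoint
    reader: str  # reader endpoint, load-balanced across reader instances

    def for_statement(self, sql: str, in_transaction: bool = False) -> str:
        """Return the reader endpoint only for standalone, non-locking SELECTs."""
        words = sql.split()
        is_select = bool(words) and words[0].upper() == "SELECT"
        locks_rows = "FOR UPDATE" in " ".join(words).upper()
        if is_select and not in_transaction and not locks_rows:
            return self.reader
        # Writes, locking reads, and anything inside a transaction go to the writer.
        return self.writer


endpoints = ClusterEndpoints(
    writer="cluster.cluster-xxx.us-east-1.rds.amazonaws.com",     # placeholder
    reader="cluster.cluster-ro-xxx.us-east-1.rds.amazonaws.com",  # placeholder
)
print(endpoints.for_statement("SELECT * FROM scores ORDER BY points DESC LIMIT 10"))
print(endpoints.for_statement("UPDATE scores SET points = points + 5 WHERE player_id = 42"))
```

Reads that must observe a write made a moment earlier are safer on the writer endpoint, since readers can trail the writer slightly.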
    Aurora Cluster Endpoints
    ========================
    app writes ──► cluster.cluster-xxx.us-east-1.rds.amazonaws.com    ──► Writer   (always routes to writer)
    app reads  ──► cluster.cluster-ro-xxx.us-east-1.rds.amazonaws.com ──► Reader 1 (load balances)
                                                                      └──► Reader 2
                                                                      └──► Reader 3
Aurora Serverless
Aurora Serverless v2 is an on-demand, auto-scaling configuration for the Aurora cluster. Rather than choosing an instance size, you specify a minimum and maximum number of Aurora Capacity Units (ACUs). The database scales compute capacity up when demand increases and back down (including to near-zero) when the application is idle.
This makes Aurora Serverless v2 useful for:
- Development and test environments that run intermittently
- Applications with unpredictable traffic patterns (event-driven workloads, SaaS applications with variable tenant activity)
- New applications where sizing is uncertain
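Aurora Serverless v2 scales in increments of 0.5 ACUs and adjusts capacity while the database keeps running, typically starting to scale within a fraction of a second of a change in load. How quickly it reaches the maximum depends on the current capacity, because a larger instance can scale up faster than a small one. This is meaningfully different from the original Serverless v1, which had cold start delays and less granular scaling.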
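To make the minimum/maximum setting concrete, here is a small sketch that validates a pair of ACU values and builds a dictionary shaped like the `ServerlessV2ScalingConfiguration` parameter (`MinCapacity`, `MaxCapacity`) that the RDS API accepts, as far as I understand it. The helper names are hypothetical, the allowed range varies by engine version and is not enforced here, and the figure of roughly 2 GiB of memory per ACU is an approximation.

```python
ACU_STEP = 0.5           # scaling granularity, from the text above
GIB_PER_ACU = 2.0        # approximate memory per ACU (assumption)


def serverless_v2_scaling(min_acu: float, max_acu: float) -> dict:
    """Validate ACU bounds and return a scaling-configuration style dict."""
    for name, value in (("min_acu", min_acu), ("max_acu", max_acu)):
        if value < 0 or (value / ACU_STEP) % 1 != 0:
            raise ValueError(f"{name} must be a non-negative multiple of {ACU_STEP}")
    if min_acu > max_acu:
        raise ValueError("min_acu must not exceed max_acu")
    return {"MinCapacity": min_acu, "MaxCapacity": max_acu}


def capacity_levels(config: dict) -> int:
    """Number of distinct capacity values between the bounds, inclusive."""
    return int((config["MaxCapacity"] - config["MinCapacity"]) / ACU_STEP) + 1


def approx_memory_gib(acu: float) -> float:
    return acu * GIB_PER_ACU


config = serverless_v2_scaling(0.5, 16)
print(config)
print("capacity levels:", capacity_levels(config))
print("memory range (GiB):", approx_memory_gib(0.5), "to", approx_memory_gib(16))
```

A dictionary like `config` would be passed when creating or modifying the cluster; choosing the minimum is mostly a trade-off between idle cost and how warm the buffer cache stays.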
Aurora Global Database
Aurora Global Database replicates a single Aurora cluster to up to five additional AWS regions. Writes happen in one primary region; the storage layer replicates to secondary regions with typical lag under one second.
    Aurora Global Database
    ======================
    Primary Region (us-east-1)                  Secondary Region (eu-west-1)
    ──────────────────────────                  ──────────────────────────
    Writer Instance                             Reader Instances (low latency for EU users)
          │                                           │
    Aurora Storage ──replication (~1s)──►       Aurora Storage
    (primary cluster)                           (secondary cluster, read-only)

    Failover: promote eu-west-1 to primary in ~1 minute
    Use case: global SaaS, disaster recovery with cross-region failover
Secondary regions serve read traffic with local latency: a user in Frankfurt reads from the EU region rather than making a transatlantic request to Virginia. For disaster recovery, you can promote a secondary region to primary in about a minute.
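The read/write split across regions can be modelled in a few lines. This is a toy sketch of the routing idea only: `GlobalTopology` is a hypothetical class, falling back to the primary for clients in a region with no cluster is a simplifying assumption, and `promote` models an unplanned failover in which the old primary is simply dropped.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GlobalTopology:
    primary: str                  # the only region that accepts writes
    secondaries: Tuple[str, ...]  # read-only regions

    def write_region(self) -> str:
        return self.primary

    def read_region(self, client_region: str) -> str:
        # Serve reads locally when the client's region hosts a cluster.
        if client_region == self.primary or client_region in self.secondaries:
            return client_region
        return self.primary  # simplification: no nearest-region lookup

    def promote(self, region: str) -> "GlobalTopology":
        if region not in self.secondaries:
            raise ValueError(f"{region} is not a secondary region")
        remaining = tuple(r for r in self.secondaries if r != region)
        return GlobalTopology(primary=region, secondaries=remaining)


topology = GlobalTopology(primary="us-east-1", secondaries=("eu-west-1",))
print(topology.read_region("eu-west-1"), topology.write_region())

after_failover = topology.promote("eu-west-1")
print(after_failover.write_region(), after_failover.secondaries)
```

Writes from EU clients still cross the Atlantic until a promotion happens, which is why this layout suits read-heavy global workloads.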
Aurora vs Standard RDS: When to Choose Which
Aurora is not always the right choice despite its advantages:
| Factor | Choose Aurora | Choose Standard RDS |
|---|---|---|
| Performance needed | Very high throughput | Moderate workload |
| Cost | Higher instance cost | Lower instance cost |
| Failover speed | <30 seconds | 60-120 seconds |
| MySQL/PostgreSQL compat | Full compatibility | Full compatibility |
| Serverless option | Aurora Serverless v2 | Not available |
| Global replication | Global Database | Cross-region read replicas |
| Oracle / SQL Server | Not supported | Supported |
If you run Oracle or SQL Server, Aurora is not an option — those engines are not available. For MySQL or PostgreSQL workloads where cost matters more than maximum performance, standard RDS may be the right call.
Real-World Use Case: Gaming Leaderboard
A mobile game has 10 million players. The leaderboard service handles thousands of score updates per second during peak hours and millions of leaderboard reads. The team chose Aurora PostgreSQL with:
- One writer instance (db.r6g.4xlarge) handling score writes
- Three reader instances handling leaderboard queries
- Aurora Serverless v2 minimum set low for off-peak hours (2 AM to 8 AM) when traffic drops
- Storage auto-scales — no one needs to track how much data the database will accumulate over time
During a major in-game event, traffic spikes 10x. Readers scale up via Aurora Serverless v2 capacity. When the event ends, they scale back down. No one opens a support ticket to resize instances.
Key Interview Points
- Six copies across three AZs: Aurora writes to six storage nodes and requires 4 of 6 to acknowledge. Writes survive the loss of any two copies, and reads (3 of 6) survive the loss of an entire AZ plus one more copy
- Aurora storage is separate from compute: adding a reader does not copy data — the reader accesses the same shared storage cluster immediately
- Failover is faster than standard RDS because readers already have access to the same storage; they do not need to sync first
- Aurora is MySQL or PostgreSQL compatible — not a separate query language; your existing application code works unchanged
- Aurora Serverless v2 vs v1: v2 scales in fine-grained increments and can run in a Multi-AZ cluster; v1 has cold start delays and coarser scaling. v1 is legacy.
- Global Database is not the same as cross-region read replicas in standard RDS — Global Database operates at the storage layer with faster replication and supports managed failover
- Maximum storage is 128 TB, and it grows automatically in 10 GB increments — you never resize storage manually