Amazon RDS: Managed Relational Databases Without DBA-Level Infrastructure Work
Running a relational database in the cloud comes with a choice: manage the EC2 instance yourself (install PostgreSQL or MySQL, configure replication, set up automated backups, apply security patches, monitor disk space) or use a managed service. Amazon RDS makes the managed choice practical by handling the operational work while leaving you full control over schema design, query tuning, and configuration parameters.
The managed service model matters most at 2 AM when a primary database fails. Without RDS, someone needs to wake up, assess the failure, promote a standby, and update connection strings. With RDS Multi-AZ, the failover happens automatically, typically in 60-120 seconds. That trade-off (less control over the infrastructure, more time for application development) is why many production web applications on AWS use RDS rather than self-managed databases on EC2.
Supported Database Engines
RDS supports six database engines:
- MySQL (versions 5.7, 8.0)
- PostgreSQL (versions 13, 14, 15, 16)
- MariaDB
- Oracle (Standard Edition 2, Enterprise Edition)
- Microsoft SQL Server (Express, Web, Standard, Enterprise editions)
- Amazon Aurora (covered separately; Aurora is an AWS-built engine designed to be compatible with MySQL and PostgreSQL)
The choice of engine affects licensing cost (Oracle and SQL Server require licenses), maximum storage, and available features. Most new projects choosing a relational database choose PostgreSQL or MySQL because they are open-source, well-supported, and lack additional licensing fees.
Multi-AZ: High Availability
Multi-AZ deployments maintain a synchronous standby replica in a different Availability Zone. Every write to the primary is synchronously replicated to the standby before AWS acknowledges the write to the application. If the primary instance fails — hardware failure, AZ outage, scheduled maintenance — RDS automatically promotes the standby and updates the DNS endpoint to point to it.
    RDS Multi-AZ Architecture
    =========================

    Application
        │
        ▼  (connects to DNS endpoint, not IP)
    mydb.cluster.us-east-1.rds.amazonaws.com
        │
        ├──► Primary (AZ-1a) ──synchronous replication──► Standby (AZ-1b)
        │        │                                            │
        │     Writes                                       Standby
        │     accepted                                     (not accessible
        │                                                  for reads in standard
        │                                                  Multi-AZ)
        │
        │  Failover event (AZ-1a fails):
        │
        └──► DNS endpoint updates to point to former Standby
             (now Primary) in AZ-1b (typically 60-120 seconds)

The standby in standard Multi-AZ is not a read replica: it receives replicated writes but does not serve reads. The purpose is purely availability, not performance. For read scaling, use read replicas.
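Because the endpoint moves to a different instance during failover, clients see refused or dropped connections until the switch completes. The following is a minimal sketch of reconnecting with exponential backoff. The `connect` callable is a hypothetical stand-in for whatever your database driver provides, and `ConnectionError` is an assumption: real drivers usually raise their own exception types, which you would catch instead.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    max_attempts: int = 8,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, backing off between attempts."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait 1s, 2s, 4s, ... capped at max_delay before retrying.
            sleep(min(base_delay * 2 ** attempt, max_delay))
    raise ValueError("max_attempts must be at least 1")
```

With the defaults shown, the waits add up to roughly 90 seconds, which is in the range of a typical failover. Always reconnect through the DNS endpoint rather than a cached IP address, since the IP behind the endpoint changes.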
Read Replicas: Scaling Reads
A read replica is an asynchronous copy of the primary database that serves SELECT queries. You can have up to 15 read replicas for MySQL, MariaDB, and PostgreSQL. Applications direct their read traffic to replica endpoints and write traffic to the primary endpoint.
    Read Replica Pattern
    ====================

    Write traffic        Read traffic (reports, analytics, search)
         │                   │            │            │
         ▼                   ▼            ▼            ▼
    Primary DB          Replica 1    Replica 2    Replica 3
    (all writes)        (reads)      (reads)      (reads)

    Primary replicates asynchronously to all replicas
    Replica lag is typically sub-second for healthy deployments
    Replicas can be in different regions (cross-region replication)

Cross-region read replicas reduce latency for globally distributed applications and serve as a recovery point for regional disasters. You can also promote a read replica to become a standalone database, which is useful for breaking off a read-heavy analytics workload into its own independent database.
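RDS does not split reads from writes for you; the application (or a proxy in front of it) chooses which endpoint to use. A minimal sketch of that routing decision follows. The `EndpointRouter` class and the hostnames in the usage note are hypothetical, and classifying a statement by its leading `SELECT` keyword is a simplification that a real application would refine.

```python
import itertools
from typing import Iterator, List, Optional


class EndpointRouter:
    """Send SELECTs to replicas in round-robin order, everything else to the primary."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        self._replicas: Optional[Iterator[str]] = (
            itertools.cycle(replicas) if replicas else None
        )

    def endpoint_for(self, sql: str, needs_fresh_data: bool = False) -> str:
        # Replication is asynchronous, so a read that must see a write made
        # moments ago should go to the primary (needs_fresh_data=True).
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and not needs_fresh_data and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

For example, `EndpointRouter("primary.example", ["replica-1.example", "replica-2.example"])` alternates `SELECT` statements between the two replicas and returns the primary for `INSERT`, `UPDATE`, and `DELETE`.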
Automated Backups and Snapshots
RDS creates automated backups daily during a configurable backup window, retaining them for a period you specify (1-35 days). These backups support point-in-time recovery: you can restore to any second within the retention period, not just the moment the backup ran, up to the latest restorable time. Because transaction logs are backed up every 5 minutes, the latest restorable time is usually within the last five minutes.
Manual snapshots are separate from automated backups. They persist until you delete them, even if you delete the RDS instance. This is important for long-term retention beyond the 35-day automated backup limit.
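A point-in-time restore always creates a new instance rather than overwriting the source. As a rough sketch, the request can be assembled as below. The parameter names match boto3's `restore_db_instance_to_point_in_time` call, while the helper function, the instance identifiers in the usage note, and the local window check are assumptions for illustration; the service itself reports the authoritative latest restorable time.

```python
from datetime import datetime, timedelta


def build_pitr_request(
    source_id: str,
    target_id: str,
    restore_time: datetime,
    now: datetime,
    retention_days: int,
) -> dict:
    """Build keyword arguments for a point-in-time restore request."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention must be between 1 and 35 days")
    # Approximate local check: the target time must fall inside the
    # retention window. RDS enforces the real window on its side.
    earliest = now - timedelta(days=retention_days)
    if not earliest <= restore_time <= now:
        raise ValueError("restore_time is outside the retention window")
    return {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,  # a new instance is created
        "RestoreTime": restore_time,
    }


# Usage (not executed here, requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("rds").restore_db_instance_to_point_in_time(**request)
```

After the restore finishes, the application has to be pointed at the new instance's endpoint, or the recovered rows copied back into the original database.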
Parameter Groups and Option Groups
Parameter Groups are containers for database engine configuration. Instead of editing my.cnf on a server, you set parameters like max_connections, innodb_buffer_pool_size, or work_mem in a parameter group and associate it with your RDS instance. Changes to static parameters require a reboot; dynamic parameters apply immediately.
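The static versus dynamic distinction shows up in the API as an apply method attached to each change. The sketch below builds the `Parameters` list in the shape that boto3's `modify_db_parameter_group` expects. The helper function and the group name in the usage comment are hypothetical, and the caller supplies the set of static parameters; in practice that information comes from the `ApplyType` field returned by `describe_db_parameters`.

```python
from typing import Dict, List, Set


def build_parameter_changes(
    changes: Dict[str, str], static_params: Set[str]
) -> List[Dict[str, str]]:
    """Attach the right ApplyMethod to each parameter change."""
    return [
        {
            "ParameterName": name,
            "ParameterValue": value,
            # Static parameters only take effect after a reboot.
            "ApplyMethod": "pending-reboot" if name in static_params else "immediate",
        }
        for name, value in changes.items()
    ]


# Usage (not executed here, requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("rds").modify_db_parameter_group(
#       DBParameterGroupName="shop-postgres",  # hypothetical group name
#       Parameters=build_parameter_changes(
#           {"work_mem": "65536", "max_connections": "500"},
#           static_params={"max_connections", "shared_buffers"},
#       ),
#   )
```

Default parameter groups cannot be modified, so a custom group has to be created and attached to the instance first.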
Option Groups (available for MySQL, MariaDB, Oracle, and SQL Server) manage optional engine features such as Oracle TDE, MySQL memcached, or SQL Server native backup and restore.
Storage and Performance
RDS storage is backed by EBS and comes in three types:
- gp3 SSD (general purpose): baseline 3,000 IOPS, suitable for most workloads
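- io1 SSD (provisioned IOPS): you provision the IOPS you need, for high-transaction databases; the ceiling is in the tens of thousands of IOPS or higher, depending on engine and instance class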
- Magnetic (deprecated): avoid for new deployments
Storage Auto Scaling can automatically expand your RDS storage when it approaches capacity, up to a maximum you configure.
Real-World Use Case: E-Commerce Platform
An e-commerce platform uses RDS PostgreSQL with this configuration:
- Primary instance: db.r6g.2xlarge, Multi-AZ enabled, 500 GB gp3 storage
- 2 read replicas: same instance type, handle product search and order history queries
- Automated backups: 14-day retention, point-in-time recovery for accidental data deletion
- Parameter Group: tuned max_connections, shared_buffers, and checkpoint_completion_target for transactional workloads
- CloudWatch alarms: notify when CPU exceeds 80%, free storage drops below 50 GB, or replica lag exceeds 30 seconds
This setup handles thousands of concurrent shoppers with automatic failover if the primary fails and read replicas absorbing the catalog browsing load.
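The three alarm conditions in this configuration map onto the `CPUUtilization`, `FreeStorageSpace`, and `ReplicaLag` metrics in the `AWS/RDS` CloudWatch namespace. The sketch below builds alarm definitions in the shape accepted by boto3's `put_metric_alarm`. The helper, the instance identifiers, the SNS topic, the 5-minute period, and the single evaluation period are all assumptions chosen for illustration. Note that `FreeStorageSpace` is reported in bytes, so 50 GB is converted here (as GiB).

```python
from typing import Dict, List

GIB = 1024 ** 3


def build_alarms(
    primary_id: str, replica_ids: List[str], sns_topic_arn: str
) -> List[Dict]:
    """Build CloudWatch alarm definitions for the thresholds described above."""

    def alarm(instance_id: str, metric: str, operator: str, threshold: float) -> Dict:
        return {
            "AlarmName": f"{instance_id}-{metric}",
            "Namespace": "AWS/RDS",
            "MetricName": metric,
            "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
            "Statistic": "Average",
            "Period": 300,            # seconds; assumed evaluation window
            "EvaluationPeriods": 1,   # assumed; raise to reduce noise
            "ComparisonOperator": operator,
            "Threshold": threshold,
            "AlarmActions": [sns_topic_arn],
        }

    alarms = [
        alarm(primary_id, "CPUUtilization", "GreaterThanThreshold", 80.0),
        alarm(primary_id, "FreeStorageSpace", "LessThanThreshold", 50 * GIB),
    ]
    # Replica lag is measured on each replica, in seconds.
    alarms += [
        alarm(rid, "ReplicaLag", "GreaterThanThreshold", 30.0) for rid in replica_ids
    ]
    return alarms


# Usage (not executed here, requires boto3 and AWS credentials):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   for definition in build_alarms("shop-primary", ["shop-replica-1"], topic_arn):
#       cw.put_metric_alarm(**definition)
```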
Key Interview Points
- Multi-AZ vs Read Replica: Multi-AZ is for high availability (automatic failover, synchronous replication, standby not readable); Read Replica is for performance (async replication, readable, manual promotion only)
- Failover time: Multi-AZ failover typically takes 60-120 seconds, which covers failure detection, standby promotion, and the DNS endpoint update; your application must handle connection retries and should not cache the endpoint's IP address
- No SSH access: you do not have OS-level access to standard RDS instances; for that level of control, run a self-managed database on EC2
- IAM authentication: RDS supports IAM database authentication for MySQL, MariaDB, and PostgreSQL; clients connect with short-lived tokens (valid for 15 minutes) instead of static database passwords
- Encryption: enabling encryption on an existing unencrypted RDS instance requires creating a snapshot, copying it as encrypted, then restoring — you cannot encrypt in-place
- io1 IOPS limits: the IOPS you can provision depend on allocated storage; the maximum ratio for io1 is 50 IOPS per GB, so higher IOPS targets require more storage
- Cross-region read replicas do not provide automatic failover — they require manual promotion, unlike Multi-AZ which is automatic
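The io1 ratio in the list above is simple arithmetic, sketched below. The function names are hypothetical, and only the 50 IOPS per GB ratio is modeled; separate per-engine and per-instance-class ceilings also apply and are not checked here.

```python
MAX_IOPS_PER_GB = 50  # io1 ratio quoted above


def max_provisionable_iops(storage_gb: int) -> int:
    """Highest IOPS the ratio allows for a given allocated storage size."""
    return storage_gb * MAX_IOPS_PER_GB


def min_storage_for_iops(iops: int) -> int:
    """Smallest storage size (GB) that satisfies the ratio for a target IOPS."""
    return -(-iops // MAX_IOPS_PER_GB)  # ceiling division


print(max_provisionable_iops(500))   # 25000
print(min_storage_for_iops(20000))   # 400
```

In other words, a 500 GB io1 volume can be provisioned with at most 25,000 IOPS under this ratio, and a 20,000 IOPS target needs at least 400 GB allocated.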