AWS Route 53: DNS, Health Checks, and Routing Policies That Power Resilient Apps
Route 53 is named after TCP/UDP port 53, the standard DNS port. It provides DNS resolution, domain registration, and traffic routing with health checking, a combination that makes architectures genuinely multi-region and fault tolerant rather than multi-region in name only.
Most teams use Route 53 for basic DNS initially and then gradually discover the routing policies and health checks as they need more sophisticated traffic management.
Hosted Zones
A hosted zone is a container for the DNS records of a domain. When you register a domain through Route 53, it creates a public hosted zone automatically; for a domain registered elsewhere or transferred in, you create the hosted zone yourself. You can also create private hosted zones, which resolve only within the VPCs you associate with them.

    # Create a public hosted zone
    aws route53 create-hosted-zone \
      --name example.com \
      --caller-reference "$(date +%s)" \
      --query 'HostedZone.Id' --output text

    # Create a private hosted zone (VPC-only resolution)
    aws route53 create-hosted-zone \
      --name internal.example.com \
      --caller-reference "$(date +%s)-private" \
      --vpc VPCRegion=us-east-1,VPCId=vpc-0abc123 \
      --hosted-zone-config Comment="Private zone for internal services",PrivateZone=true

Private hosted zones are used for internal service discovery: api.internal.example.com resolves to a private IP within your VPC, but returns NXDOMAIN from the public internet.
DNS Record Types
| Type | Purpose | Example |
|---|---|---|
| A | Maps hostname to IPv4 | api.example.com → 54.123.45.67 |
| AAAA | Maps hostname to IPv6 | api.example.com → 2001:db8::1 |
| CNAME | Maps hostname to another hostname | www → app.example.com |
| MX | Mail server records | 10 mail.example.com |
| TXT | Text records (SPF, DKIM, verification) | v=spf1 include:... ~all |
| NS | Name server records | Auto-created by Route 53 |
| SOA | Zone authority record | Auto-created by Route 53 |
| Alias | AWS-specific, maps to AWS resources | api → alb-1234.us-east-1.elb.amazonaws.com |
Alias Records: The Important AWS-Specific Concept
CNAME records cannot be used at the zone apex (you cannot have a CNAME for example.com itself — only for subdomains). Alias records solve this. An Alias record maps a hostname directly to an AWS resource’s DNS name:
- Application Load Balancer
- Network Load Balancer
- CloudFront distribution
- S3 website endpoint
- API Gateway endpoint
- Another Route 53 record

    aws route53 change-resource-record-sets \
      --hosted-zone-id ZXYZ123 \
      --change-batch '{
        "Changes": [{
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "example.com",
            "Type": "A",
            "AliasTarget": {
              "HostedZoneId": "Z35SXDOTRQ7X7K",
              "DNSName": "my-alb-1234567890.us-east-1.elb.amazonaws.com",
              "EvaluateTargetHealth": true
            }
          }
        }]
      }'

Route 53 does not charge for queries to Alias records that point at AWS resources, whereas CNAME queries are billed like any other query. Setting EvaluateTargetHealth to true means Route 53 checks whether the ALB has healthy targets before returning it in DNS responses. Note that the HostedZoneId inside AliasTarget is the load balancer's own hosted zone ID, which depends on its region and type; it is not the ID of your zone.
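The apex rule above can be written down as a small decision helper. This is only a sketch of the reasoning: the function name and its boolean parameter are hypothetical, and it covers just the two cases discussed here (an AWS resource target versus an arbitrary hostname target).

```python
def record_kind(record_name: str, zone_name: str, target_is_aws_resource: bool) -> str:
    """Pick between an Alias and a CNAME for a hostname-to-hostname mapping."""
    def norm(name: str) -> str:
        # DNS names are case-insensitive and may carry a trailing dot.
        return name.rstrip(".").lower()

    at_apex = norm(record_name) == norm(zone_name)
    if target_is_aws_resource:
        return "ALIAS"  # valid at the apex and on subdomains
    if at_apex:
        raise ValueError("a CNAME cannot be created at the zone apex")
    return "CNAME"


print(record_kind("example.com", "example.com", True))       # ALIAS
print(record_kind("www.example.com", "example.com", False))  # CNAME
```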
Health Checks
Route 53 health checks monitor endpoints and influence routing decisions. They can check:
- An endpoint by IP or domain name (HTTP, HTTPS, TCP)
- A CloudWatch alarm
- Another health check (calculated health checks)

    # Create an HTTPS health check
    aws route53 create-health-check \
      --caller-reference "$(date +%s)" \
      --health-check-config '{
        "IPAddress": "54.123.45.67",
        "Port": 443,
        "Type": "HTTPS",
        "ResourcePath": "/health",
        "FullyQualifiedDomainName": "api.example.com",
        "RequestInterval": 10,
        "FailureThreshold": 3,
        "EnableSNI": true
      }'

With RequestInterval: 10 and FailureThreshold: 3, Route 53 marks the endpoint unhealthy after about 30 seconds of consecutive failures. Route 53 health checkers are distributed globally; the endpoint is considered healthy as long as more than 18% of the checkers report it healthy.
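The 18% rule is easier to reason about with numbers. The sketch below models how per-checker results could be combined into one verdict; the function name and the vote lists are hypothetical, and it illustrates the threshold rather than Route 53's internal implementation.

```python
def endpoint_is_healthy(checker_votes: list[bool], healthy_fraction: float = 0.18) -> bool:
    """Healthy when more than `healthy_fraction` of checkers report success."""
    if not checker_votes:
        raise ValueError("need at least one checker result")
    return sum(checker_votes) / len(checker_votes) > healthy_fraction


# 16 hypothetical checkers: 3 successes is 18.75%, 2 successes is 12.5%.
print(endpoint_is_healthy([True] * 3 + [False] * 13))  # True
print(endpoint_is_healthy([True] * 2 + [False] * 14))  # False
```

The low threshold means a handful of checkers with a bad network path cannot take an endpoint out of rotation on their own.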
Routing Policies
Routing policies control how Route 53 responds to DNS queries. The choice determines how you distribute or direct traffic.
Simple Routing
One record with one value, or with multiple values that are all returned in random order. Health checks do not influence the answer.

    aws route53 change-resource-record-sets \
      --hosted-zone-id ZXYZ123 \
      --change-batch '{
        "Changes": [{
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            "TTL": 300,
            "ResourceRecords": [{"Value": "54.123.45.67"}]
          }
        }]
      }'

Weighted Routing
Multiple records for the same name, each with a weight. Route 53 distributes queries proportionally. Use case: canary deployments (send 5% to new version) or gradual traffic shifts.

    # 90% to version 1, 10% to version 2
    aws route53 change-resource-record-sets \
      --hosted-zone-id ZXYZ123 \
      --change-batch '{
        "Changes": [{
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            "SetIdentifier": "v1",
            "Weight": 90,
            "TTL": 60,
            "ResourceRecords": [{"Value": "54.123.45.67"}],
            "HealthCheckId": "hc-v1-abc123"
          }
        }, {
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            "SetIdentifier": "v2",
            "Weight": 10,
            "TTL": 60,
            "ResourceRecords": [{"Value": "54.234.56.78"}],
            "HealthCheckId": "hc-v2-def456"
          }
        }]
      }'

If you attach health checks, Route 53 skips unhealthy records even if they would win the weighted lottery.
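The "weighted lottery" is just each record's weight divided by the sum of the weights still in play. A minimal sketch, assuming a hypothetical traffic_share helper and ignoring edge cases such as every record being unhealthy or every weight being zero:

```python
def traffic_share(weights: dict[str, int], healthy: set[str]) -> dict[str, float]:
    """Expected fraction of DNS answers per record among healthy, non-zero weights."""
    eligible = {name: w for name, w in weights.items() if name in healthy and w > 0}
    total = sum(eligible.values())
    if total == 0:
        return {}  # simplification: the sketch does not model this case
    return {name: w / total for name, w in eligible.items()}


weights = {"v1": 90, "v2": 10}
print(traffic_share(weights, healthy={"v1", "v2"}))  # {'v1': 0.9, 'v2': 0.1}
print(traffic_share(weights, healthy={"v1"}))        # {'v1': 1.0}
```

Because the split happens per DNS answer and resolvers cache answers for the TTL, the observed traffic split only approximates these fractions.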
Latency-Based Routing
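Route 53 returns the record for the AWS region with the lowest latency to the source of the DNS query. It relies on latency data collected between networks and regions over time, not on a live measurement for each query. This is the right choice for multi-region deployments where you want users automatically routed to the fastest region for them, which is usually, but not always, the geographically nearest one.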

    # Record for us-east-1
    aws route53 change-resource-record-sets \
      --hosted-zone-id ZXYZ123 \
      --change-batch '{
        "Changes": [{
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            "SetIdentifier": "us-east-1",
            "Region": "us-east-1",
            "TTL": 60,
            "ResourceRecords": [{"Value": "54.123.45.67"}],
            "HealthCheckId": "hc-us-east-abc"
          }
        }]
      }'
    # Repeat for eu-west-1, ap-southeast-1, etc.

Failover Routing
One PRIMARY record and one SECONDARY. Traffic goes to PRIMARY when healthy; Route 53 automatically switches to SECONDARY when the primary health check fails.
Route 53 Failover:

    api.example.com
      PRIMARY:   54.123.45.67 (us-east-1) [health check: OK]
          ↓ if health check fails
      SECONDARY: 54.234.56.78 (us-west-2) [health check: OK]

    aws route53 change-resource-record-sets \
      --hosted-zone-id ZXYZ123 \
      --change-batch '{
        "Changes": [{
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            "SetIdentifier": "primary",
            "Failover": "PRIMARY",
            "TTL": 30,
            "ResourceRecords": [{"Value": "54.123.45.67"}],
            "HealthCheckId": "hc-primary-abc"
          }
        }, {
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            "SetIdentifier": "secondary",
            "Failover": "SECONDARY",
            "TTL": 30,
            "ResourceRecords": [{"Value": "54.234.56.78"}]
          }
        }]
      }'

Geolocation Routing
Routes traffic based on the geographic location the DNS query comes from. Useful for compliance (traffic from EU users must stay in the EU), language-specific content, or regional regulations.

    Query from Germany → returns record for eu-west-1
    Query from Brazil  → returns record for sa-east-1
    Query from US      → returns record for us-east-1
    All others         → returns default record

Geolocation routing works at continent, country, or US state level.
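When records exist at more than one level, the most specific location wins: state before country, country before continent, and the default record last. The lookup below is a sketch of that precedence only; the rule keys, the function name, and the two-letter codes are hypothetical conventions chosen for the example.

```python
from typing import Optional


def pick_geo_record(rules: dict[str, str], continent: str, country: str,
                    us_state: Optional[str] = None) -> Optional[str]:
    """Return the record for the most specific location that has a rule."""
    candidates = []
    if country == "US" and us_state:
        candidates.append(f"state:{us_state}")
    candidates += [f"country:{country}", f"continent:{continent}", "default"]
    for key in candidates:
        if key in rules:
            return rules[key]
    return None  # no matching rule and no default record


rules = {
    "country:DE": "eu-west-1",
    "country:BR": "sa-east-1",
    "country:US": "us-east-1",
    "default": "default record",
}
print(pick_geo_record(rules, continent="EU", country="DE"))  # eu-west-1
print(pick_geo_record(rules, continent="AS", country="JP"))  # default record
```

The final `return None` is the case to avoid in practice: without a default record, queries from unmatched locations get no usable answer.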
Geoproximity Routing
Similar to geolocation but allows bias adjustment: you expand or shrink the geographic area each endpoint serves. It was introduced as a Traffic Flow feature (the visual policy editor), and Route 53 has since added support for geoproximity on ordinary records as well. Useful for shifting traffic between regions for maintenance.
Multi-Value Answer Routing
Returns up to 8 healthy IP addresses in response to each DNS query. Clients randomly select one. Acts as a basic load balancer at the DNS level, without replacing ALB.
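A rough model of that behaviour, assuming a hypothetical multivalue_answer helper and a record table mapping IPs to health status. The fallback to unhealthy records when nothing is healthy reflects my understanding of the documented behaviour and is worth verifying before relying on it.

```python
import random

MAX_ANSWERS = 8  # upper bound on records per response, as stated above


def multivalue_answer(records: dict[str, bool], rng: random.Random) -> list[str]:
    """Pick up to eight healthy IPs at random for one DNS response."""
    healthy = [ip for ip, ok in records.items() if ok]
    # Assumption: with no healthy records, unhealthy ones are returned instead.
    pool = healthy if healthy else list(records)
    return rng.sample(pool, min(MAX_ANSWERS, len(pool)))


# Twelve hypothetical records in a documentation IP range, two of them unhealthy.
records = {f"192.0.2.{i}": i not in (3, 7) for i in range(1, 13)}
answer = multivalue_answer(records, random.Random(42))
print(len(answer))                       # 8
print(all(records[ip] for ip in answer)) # True
```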
Real-World Scenario: Active-Active Multi-Region
A SaaS company runs API servers in us-east-1 and eu-west-1:

    api.example.com (latency routing):
      us-east-1 record → ALB in us-east-1 (health check: /health on ALB)
      eu-west-1 record → ALB in eu-west-1 (health check: /health on ALB)

    Outcome:
      US users → us-east-1 ALB (20ms latency)
      EU users → eu-west-1 ALB (15ms from Europe)

    If eu-west-1 ALB health check fails:
      Route 53 removes eu-west-1 from responses
      EU users fall through to us-east-1 (100ms latency but functional)

This active-active pattern with latency routing and health checks provides automatic regional failover without manual DNS changes.
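The scenario reduces to "lowest latency among healthy regions". Here is a minimal sketch using the latency figures quoted above for an EU user; the latency_route function and its inputs are hypothetical stand-ins for what Route 53 decides internally.

```python
from typing import Optional


def latency_route(latency_ms: dict[str, float], healthy: set[str]) -> Optional[str]:
    """Return the healthy region with the lowest latency, or None if none are healthy."""
    candidates = {region: ms for region, ms in latency_ms.items() if region in healthy}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


eu_user = {"us-east-1": 100, "eu-west-1": 15}  # milliseconds, from the scenario
print(latency_route(eu_user, healthy={"us-east-1", "eu-west-1"}))  # eu-west-1
print(latency_route(eu_user, healthy={"us-east-1"}))               # us-east-1
```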
Route 53 Resolver
Route 53 Resolver handles DNS within VPCs and enables hybrid DNS:
Inbound endpoints: Allow on-premises DNS servers to resolve AWS private hosted zones.
Outbound endpoints: Allow EC2 instances in your VPC to resolve on-premises domain names via forwarding rules.
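A forwarding rule applies to its domain and every name beneath it, and when several rules match, the most specific domain wins. The sketch below models that suffix match; the function name and rule table are hypothetical, and the corp.example.com domain with its 192.168.1.53 target is used purely as an illustrative value.

```python
from typing import Optional


def match_forward_rule(query_name: str,
                       rules: dict[str, list[str]]) -> Optional[tuple[str, list[str]]]:
    """Find the most specific rule whose domain is a suffix of the query name."""
    labels = query_name.rstrip(".").lower().split(".")
    # Try the full name first, then strip one leading label at a time.
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in rules:
            return suffix, rules[suffix]
    return None  # no rule matched: the query is resolved normally inside the VPC


rules = {"corp.example.com": ["192.168.1.53"]}
print(match_forward_rule("db.corp.example.com", rules))  # ('corp.example.com', ['192.168.1.53'])
print(match_forward_rule("api.example.com", rules))      # None
```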

    # Create outbound resolver endpoint
    aws route53resolver create-resolver-endpoint \
      --creator-request-id "$(date +%s)" \
      --security-group-ids sg-resolver \
      --direction OUTBOUND \
      --ip-addresses SubnetId=subnet-0a1b2c,Ip=10.0.1.50 SubnetId=subnet-0d4e5f,Ip=10.0.2.50

    # Forward .corp.example.com queries to on-premises DNS
    aws route53resolver create-resolver-rule \
      --creator-request-id "$(date +%s)-rule" \
      --rule-type FORWARD \
      --domain-name corp.example.com \
      --target-ips Ip=192.168.1.53,Port=53 \
      --resolver-endpoint-id rslvr-out-abc123

Common Interview Questions
Q: What is the difference between a CNAME and an Alias record?
CNAME maps one hostname to another and cannot be used at the zone apex. Alias records are AWS-specific, point to AWS resource DNS names, work at the zone apex, and are free to query. Alias records with EvaluateTargetHealth: true also incorporate the health of the target.
Q: How fast does Route 53 fail over when a health check fails?
The health check interval is 10 or 30 seconds. With the 10-second interval and a failure threshold of 3, Route 53 detects the failure in about 30 seconds (10s × 3 failures). The DNS TTL adds client-side caching on top of that: with a 30-second TTL, clients see the failover within roughly 60 seconds of the endpoint going down.
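The timing in this answer is simple addition: detection time plus the longest a cached answer can live. A small sketch, with a hypothetical function name, that treats the result as an approximate upper bound and ignores resolvers that do not honour the TTL:

```python
def failover_window_seconds(request_interval: int, failure_threshold: int, ttl: int) -> int:
    """Approximate worst case from outage start until clients reach the secondary."""
    detection = request_interval * failure_threshold  # consecutive failed checks
    return detection + ttl                            # plus cached answers expiring


print(failover_window_seconds(10, 3, 30))   # 60
print(failover_window_seconds(30, 3, 300))  # 390
```

The second call shows why failover records usually carry short TTLs: with the 30-second interval and a 300-second TTL, the same outage can take over six minutes to clear.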
Q: Can Route 53 route based on the content of a request?
No. Route 53 works at the DNS level: it makes routing decisions based on the query source location, the routing policy, and health check results. Content-based routing (URL path, headers, cookies) requires an ALB with listener rules.
Q: What is the difference between geolocation and latency-based routing?
Geolocation routes based on the geographic location of the DNS query (continent, country, or US state). Latency-based routing sends the query to the region with the lowest measured network latency. A user in Ireland might get eu-west-1 under geolocation but us-east-1 under latency routing if the measured latency is lower to the US. Latency routing is generally better for performance; geolocation is required for data residency compliance.