Amazon ElastiCache: In-Memory Caching for Redis and Memcached Workloads
A database query that takes 50ms on a quiet server takes 500ms when fifty clients run it simultaneously because the database is now doing fifty disk reads instead of one. Caching solves this by keeping the results of expensive operations in memory — the second request for the same data takes a fraction of a millisecond because there is no disk read, no query execution, no result serialization. Just a memory lookup.
Amazon ElastiCache is the managed AWS service for deploying and operating Redis or Memcached in your VPC. Managed means AWS handles the infrastructure, patching, monitoring, failover, and backups — you configure the cluster and write the cache integration logic in your application.
Redis vs Memcached: Choosing an Engine
Both engines store key-value data in memory. The similarity ends there.
Memcached is a distributed memory cache — nothing more. It stores string values (typically serialized JSON, msgpack, or similar). It scales horizontally by adding nodes. When a node fails, its data is gone (no persistence, no replication). Use it when you need the simplest possible cache with the lowest latency and no other requirements.
Redis is a data structure server. Values can be strings, hashes, lists, sorted sets, bitmaps, geospatial indexes, streams, and more. Redis supports persistence (write data to disk), replication (primary with multiple replicas), Pub/Sub messaging, transactions, and Lua scripting. Redis Cluster (cluster mode enabled) shards data across multiple primary nodes for horizontal scaling.
Redis vs Memcached Decision
============================
Need persistence (survive restart)?          → Redis
Need replication / HA?                       → Redis
Need pub/sub messaging?                      → Redis
Need complex data structures?                → Redis
Need sorted sets (leaderboards, rankings)?   → Redis
Need simple, pure caching only?              → Memcached or Redis
Need multi-threaded memory efficiency?       → Memcached
Migrating from existing Redis deployment?    → Redis
Short answer: Redis is almost always the right choice. Choose Memcached only if you have a specific reason.
Caching Patterns
The mechanics of caching — when to read from cache, when to go to the database, what to do on a cache miss — are formalized into patterns.
Cache-aside (lazy loading): the application checks the cache first. On a miss, it queries the database, writes the result to cache, and returns it. The cache only contains data that has been requested at least once.
Cache-Aside Pattern
===================
Application
│
├── 1. GET user:1001 from cache
│     │
│     Cache HIT → return immediately
│     │
│     Cache MISS →
│       ├── 2. SELECT * FROM users WHERE id = 1001 (database)
│       └── 3. SET user:1001 = {result} with TTL=300s in cache
│              then return to caller

Advantage: cache only holds requested data
Disadvantage: cache miss adds extra latency (cache + DB)

Write-through: every write to the database also writes to the cache. The cache is always current. The trade-off is that every write is slower (two writes instead of one) and the cache holds data whether it is ever read or not.
Write-behind (write-back): the application writes to the cache, and the cache asynchronously writes to the database. Very fast writes but complex consistency semantics and risk of data loss if the cache fails before writing to the database.
Read-through: the application reads only from cache. On a miss, the cache itself queries the database and populates itself. The application code does not know whether it is talking to cache or database directly.
For most web applications, cache-aside is the right starting point. It is straightforward, predictable, and easy to reason about during debugging.
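The cache-aside flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: `InMemoryCache` is a hypothetical stand-in for a Redis client so the example runs without a server, and `get_user` and `fake_db_lookup` are illustrative names. With the redis-py client, the corresponding calls would be `Redis.get` and `Redis.setex`.

```python
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


class InMemoryCache:
    """Hypothetical stand-in for a Redis client; mimics only get() and setex()."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # an expired entry behaves like a miss
            return None
        return value

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)


def get_user(cache: InMemoryCache, user_id: int,
             load_from_db: Callable[[int], Dict[str, Any]],
             ttl_seconds: int = 300) -> Dict[str, Any]:
    key = f"user:{user_id}"
    cached = cache.get(key)                 # 1. check the cache first
    if cached is not None:
        return json.loads(cached)           # hit: no database work at all
    user = load_from_db(user_id)            # 2. miss: query the database
    cache.setex(key, ttl_seconds, json.dumps(user))  # 3. populate with a TTL
    return user


db_calls = []


def fake_db_lookup(user_id: int) -> Dict[str, Any]:
    # Hypothetical database query; records each call so misses are visible.
    db_calls.append(user_id)
    return {"id": user_id, "name": "example"}


cache = InMemoryCache()
first = get_user(cache, 1001, fake_db_lookup)   # miss: goes to the database
second = get_user(cache, 1001, fake_db_lookup)  # hit: served from the cache
print(first == second, len(db_calls))
```

Only the first call reaches the database; the second is served from memory until the TTL expires. Passing the loader in as a function keeps the caching logic independent of any particular database driver.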
ElastiCache Redis Architecture
Replication group: a Redis primary node and up to 5 replica nodes. The primary accepts reads and writes; replicas serve reads. If the primary fails and automatic failover is enabled, ElastiCache promotes a replica. The ElastiCache service manages this failover itself; it does not rely on Redis Sentinel.
Cluster mode disabled: one primary, up to 5 replicas, stores all data in a single shard. Scale vertically by changing instance type.
Cluster mode enabled (Redis Cluster): data is sharded across up to 500 nodes per cluster (for example, 250 shards with one primary and one replica each). Each primary owns a subset of the 16,384 hash slots. Scale horizontally by adding shards. Useful when the dataset exceeds what a single node can hold, or when write throughput exceeds a single node’s capacity.
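To make the hash slot idea concrete, the sketch below computes the slot for a key the way the Redis Cluster specification describes it: CRC16 (XMODEM variant) of the key, modulo 16,384, with hash tags in braces taken into account. The function name `key_slot` is illustrative; in practice a cluster-aware client does this routing for you.

```python
import binascii

TOTAL_SLOTS = 16384


def key_slot(key: str) -> int:
    """Return the Redis Cluster hash slot for a key (CRC16/XMODEM mod 16384)."""
    data = key.encode("utf-8")
    # Hash tags: if the key contains a non-empty {...} section, only the text
    # between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 computes CRC16/XMODEM.
    return binascii.crc_hqx(data, 0) % TOTAL_SLOTS


print(key_slot("user:1001"))
# Keys sharing a hash tag land in the same slot, and so on the same shard.
print(key_slot("{user:1001}:cart") == key_slot("{user:1001}:profile"))
```

Hash tags matter because multi-key operations in cluster mode require all keys involved to live in the same slot.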
Redis Cluster Mode
====================
Cluster Mode Disabled (single shard):
  Primary ──replication──► Replica-1
                      └──► Replica-2
  All 16,384 hash slots on one primary

Cluster Mode Enabled (multi-shard):
  Shard-1: Primary-A, Replica-A1 (slots 0-5460)
  Shard-2: Primary-B, Replica-B1 (slots 5461-10922)
  Shard-3: Primary-C, Replica-C1 (slots 10923-16383)

  Each key hashes to exactly one shard
  Horizontal scaling by adding shards
Common Use Cases
Session storage: web applications need to store user session data — login state, shopping cart, preferences. Sessions can be stored in Redis so any application server can retrieve them, enabling stateless EC2 instances in an Auto Scaling group. Sessions expire automatically via Redis TTL.
Database query caching: expensive aggregations, reporting queries, and frequently-read database records cached in Redis reduce database load. The cache-aside pattern is standard here. Set TTL based on how stale the data can be — product prices might be cached for 60 seconds, article content for 5 minutes, leaderboard results for 1 second.
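Rate limiting: Redis INCR and EXPIRE commands implement per-user or per-IP rate limiting. Increment the counter for the current window, set an expiry when the key is first created, and reject the request if the returned count exceeds the limit. Because INCR is atomic, two concurrent requests cannot both read the same stale count and slip under the limit.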
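A fixed-window limiter built on INCR and EXPIRE might look roughly like the sketch below. `FakeRedis` is a hypothetical in-memory stand-in that exposes only `incr` and `expire`, so the example runs without a server; `allow_request` and the `ratelimit:` key prefix are illustrative names. The same function should work with a redis-py client, which provides `incr` and `expire` methods.

```python
import time
from typing import Dict


class FakeRedis:
    """Hypothetical in-memory stand-in exposing incr() and expire() only."""

    def __init__(self) -> None:
        self._values: Dict[str, int] = {}
        self._expiry: Dict[str, float] = {}

    def _purge_if_expired(self, key: str) -> None:
        deadline = self._expiry.get(key)
        if deadline is not None and time.monotonic() >= deadline:
            self._values.pop(key, None)
            self._expiry.pop(key, None)

    def incr(self, key: str) -> int:
        self._purge_if_expired(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key: str, seconds: int) -> None:
        if key in self._values:
            self._expiry[key] = time.monotonic() + seconds


def allow_request(client, identity: str, limit: int = 5,
                  window_seconds: int = 1) -> bool:
    key = f"ratelimit:{identity}"
    count = client.incr(key)                # atomic on a real Redis server
    if count == 1:
        client.expire(key, window_seconds)  # first request starts the window
    return count <= limit


client = FakeRedis()
results = [allow_request(client, "203.0.113.7", limit=5, window_seconds=60)
           for _ in range(7)]
print(results)  # first five allowed, the remaining two rejected
```

One caveat with this two-command form: if the application dies between INCR and EXPIRE, the key never expires. A Lua script or a MULTI/EXEC transaction that performs both steps together avoids that gap.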
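Leaderboards: Redis sorted sets are tailor-made for leaderboards. ZADD inserts or updates a score, ZREVRANK returns a user’s rank with the highest score first, and ZREVRANGE returns the top-N users (ZRANK and ZRANGE order from the lowest score instead). ZADD and ZREVRANK are O(log N), and a top-N range query is O(log N + M) for M returned members, so these operations stay fast even on very large leaderboards.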
Pub/Sub: Redis Pub/Sub enables real-time messaging between application components. A game server publishes player events; connected clients subscribed to the channel receive them instantly. This is simpler than SQS for use cases where message durability is not required.
Real-World Use Case: E-Commerce Flash Sale
A retailer runs a flash sale where 100,000 users attempt to buy a limited inventory item simultaneously. Without caching:
- Each page load queries the database for current price and inventory
- 100,000 concurrent reads overwhelm the RDS instance
- Write conflicts during checkout corrupt inventory counts
With ElastiCache Redis:
- Product price and inventory cached in Redis (TTL = 1 second, aggressive refresh)
- Rate limiter in Redis prevents any single user from submitting more than 5 requests per second
- Inventory counter uses Redis DECR (atomic decrement) to prevent overselling: the application checks the value DECR returns, and once it drops below zero, further checkout attempts are rejected before touching the database
The RDS instance handles only the actual order writes, not the constant “is this item available?” reads.
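The atomic-decrement step from this scenario can be sketched as follows. `FakeCounterStore` is a hypothetical in-memory stand-in for the Redis commands SET, DECR, and INCR, and `try_reserve` and the `inventory:` key prefix are illustrative names, not part of any AWS or Redis API. The key detail is that DECR keeps counting below zero, so the caller has to inspect the returned value.

```python
from typing import Dict


class FakeCounterStore:
    """Hypothetical in-memory stand-in for Redis SET / DECR / INCR."""

    def __init__(self) -> None:
        self._values: Dict[str, int] = {}

    def set(self, key: str, value: int) -> None:
        self._values[key] = int(value)

    def decr(self, key: str) -> int:
        # Like Redis, a missing key is treated as 0 before decrementing.
        self._values[key] = self._values.get(key, 0) - 1
        return self._values[key]

    def incr(self, key: str) -> int:
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]


def try_reserve(client, item_id: str) -> bool:
    """Reserve one unit; return False when the item is sold out."""
    key = f"inventory:{item_id}"
    remaining = client.decr(key)   # atomic on a real Redis server
    if remaining < 0:
        client.incr(key)           # undo the decrement so the counter returns to 0
        return False               # reject before touching the database
    return True


store = FakeCounterStore()
store.set("inventory:sku-123", 3)  # three units in stock
attempts = [try_reserve(store, "sku-123") for _ in range(5)]
print(attempts)  # three reservations succeed, then two are rejected
```

Only the requests that receive `True` go on to write an order to the database, which is what keeps the order table consistent with the available stock.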
Key Interview Points
- ElastiCache is not persistent storage — use it as a cache in front of a durable database, not as the system of record (except for Redis with persistence enabled and appropriate backup strategy)
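- Eviction policies: when memory fills up, Redis applies the configured maxmemory-policy (LRU, LFU, random, or TTL-based variants). Set it deliberately: under noeviction, Redis rejects writes when memory is full, and the volatile-* policies can only evict keys that have a TTL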
- Connection pooling: each Redis connection consumes memory on the server and involves TCP overhead. Use connection pooling in your application; do not create a new connection per request
- Multi-AZ failover: with Multi-AZ enabled, ElastiCache automatically promotes a replica when the primary fails, typically in under 60 seconds
- ElastiCache vs DAX: DAX is a DynamoDB-specific cache; ElastiCache is general-purpose. Use DAX when you only need to cache DynamoDB reads; use ElastiCache for broader caching needs
- In-transit and at-rest encryption: ElastiCache Redis supports both, but encryption adds overhead. Enable it for sensitive data; evaluate the latency impact for performance-critical paths
- CloudWatch metrics to monitor: CacheMisses, CacheHits, Evictions, FreeableMemory, ReplicationLag
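The two hit and miss counters are usually combined into a hit ratio, which is a quick way to judge whether the cache is earning its keep. The helper below is a minimal sketch; the function name and the sample numbers are illustrative, and the inputs are assumed to be sums of the CacheHits and CacheMisses metrics over the same period.

```python
def cache_hit_ratio(cache_hits: int, cache_misses: int) -> float:
    """Fraction of lookups served from cache; 0.0 when there was no traffic."""
    total = cache_hits + cache_misses
    if total == 0:
        return 0.0
    return cache_hits / total


# Illustrative numbers: 9,500 hits and 500 misses over the same interval.
print(f"{cache_hit_ratio(9_500, 500):.1%}")  # prints 95.0%
```

A ratio that falls while evictions climb usually suggests the node is short on memory rather than that the TTLs are wrong.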