Amazon Kinesis: Real-Time Data Streaming, Analytics, and Video Ingestion

Batch processing works fine when data arrives in daily dumps. It breaks down the moment you need to react to something within seconds — a fraud signal, a server error spike, a user abandoning a checkout flow. Amazon Kinesis is Amazon’s answer to that problem: a family of four managed services for ingesting, processing, and analysing data streams in real time.

The four services address different parts of the problem. Understanding which one to use — and when to combine them — is the practical skill this article builds.


The Four Kinesis Services

+---------------------+   +---------------------+
| Kinesis Data        |   | Kinesis Data        |
| Streams             |   | Firehose            |
|                     |   |                     |
| Custom consumers    |   | Managed delivery to |
| Full control        |   | S3, Redshift,       |
| Millisecond latency |   | OpenSearch, Splunk  |
+---------------------+   +---------------------+
+---------------------+   +---------------------+
| Kinesis Data        |   | Kinesis Video       |
| Analytics           |   | Streams             |
|                     |   |                     |
| SQL or Flink on     |   | Ingest, store, and  |
| streams             |   | process video from  |
| Real-time windowing |   | cameras and devices |
+---------------------+   +---------------------+

Kinesis Data Streams

Kinesis Data Streams (KDS) is the core service — a durable, real-time data transport. Producers push records into the stream; consumers read from it.

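Shards: A stream is divided into shards. Each shard provides up to 1 MB/second or 1,000 records/second of write capacity and up to 2 MB/second of read capacity.

If you need more throughput, you add shards. A stream with 10 shards can ingest up to 10,000 records/second or 10 MB/second, whichever limit is reached first. You can split or merge shards without downtime.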

Producer A ──┐
Producer B ──┤     +----------+      +------------+
Producer C ──┤     | Shard 1  |      | Consumer 1 |
             │     | Shard 2  |      | (Lambda)   |
             └────>| Shard 3  |─────>| Consumer 2 |
                   | ...      |      | (KCL app)  |
                   +----------+      +------------+
                      Stream

Retention: Data is retained for 24 hours by default. You can raise this to 7 days (extended retention) or as far as 365 days (long-term retention). Longer retention costs extra but is valuable for replay and audit scenarios.

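Consumers: There are two consumption modes. Standard (shared-throughput) consumers poll each shard and share its 2 MB/second of read throughput between them. Enhanced fan-out consumers each get their own dedicated 2 MB/second per shard, delivered via HTTP/2 push.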

Sequence numbers: Every record written to KDS gets a sequence number that is monotonically increasing within a shard. Consumers track their position by sequence number, which is how they resume after a failure or restart.
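The resume-after-failure behaviour can be sketched with a small in-memory model. This is an illustration only, not the KCL or any AWS API: `Shard` and `Consumer` are hypothetical names, and the small integers stand in for real sequence numbers, which are long numeric strings.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """In-memory stand-in for one shard: an append-only list of records."""
    records: list = field(default_factory=list)
    next_seq: int = 1

    def put(self, data: str) -> int:
        """Append a record and return the sequence number assigned to it."""
        seq = self.next_seq
        self.records.append((seq, data))
        self.next_seq += 1
        return seq

    def read_after(self, seq: int, limit: int) -> list:
        """Return up to `limit` records whose sequence number is above `seq`."""
        return [r for r in self.records if r[0] > seq][:limit]


@dataclass
class Consumer:
    """Tracks its position as the last sequence number it fully processed."""
    checkpoint: int = 0
    processed: list = field(default_factory=list)

    def poll(self, shard: Shard, limit: int = 2) -> None:
        for seq, data in shard.read_after(self.checkpoint, limit):
            self.processed.append(data)
            self.checkpoint = seq  # a real consumer would persist this value


shard = Shard()
for event in ["a", "b", "c", "d", "e"]:
    shard.put(event)

first = Consumer()
first.poll(shard)                    # processes "a" and "b", then "crashes"
saved_checkpoint = first.checkpoint  # 2

# A replacement consumer starts after the saved checkpoint, not at the start.
second = Consumer(checkpoint=saved_checkpoint)
second.poll(shard, limit=10)
print(second.processed)  # ['c', 'd', 'e']
```

Because the checkpoint is only advanced after a record is processed, a crash between processing and checkpointing replays that record, which is why stream consumers are usually written to tolerate duplicates.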


Kinesis Data Firehose

Firehose (since renamed Amazon Data Firehose) is the managed delivery service. You do not manage consumers or shards: Firehose buffers records, optionally transforms them with a Lambda function, and delivers them to a destination.

Delivery destinations include Amazon S3, Amazon Redshift, Amazon OpenSearch Service, and Splunk.

Producers
    |
    v
[Kinesis Firehose]
    |
    +-- Optional Lambda transform
    |   (filter, enrich, convert format)
    |
    +-- Buffer (by size or time)
    |   e.g., 5 MB or 60 seconds
    |
    v
Destination: S3 / Redshift / OpenSearch / Splunk

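Firehose buffers records until either a size threshold or a time threshold is met, then flushes the batch to the destination. The buffer interval was historically at least 60 seconds; newer configurations accept shorter intervals, down to zero buffering, but delivery is still batched and best-effort on timing. Treat Firehose as near-real-time, not real-time: if your use case requires consistently low latency or per-record processing, use KDS with a custom consumer.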
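The size-or-time flush rule can be sketched as a toy buffer. This models only the rule itself, not Firehose internals: `BufferedDelivery` is a hypothetical class, the thresholds are illustrative, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BufferedDelivery:
    """Flushes when buffered bytes reach max_bytes or max_seconds have elapsed."""
    max_bytes: int
    max_seconds: float
    batches: list = field(default_factory=list)  # (reason, record_count) per flush
    _buffer: list = field(default_factory=list)
    _size: int = 0
    _opened_at: Optional[float] = None

    def put(self, record: bytes, now: float) -> None:
        self.tick(now)  # a time-based flush may already be due
        if self._opened_at is None:
            self._opened_at = now  # the timer starts with the first record
        self._buffer.append(record)
        self._size += len(record)
        if self._size >= self.max_bytes:
            self._flush("size")

    def tick(self, now: float) -> None:
        """Flush on the time threshold if a buffer is open and old enough."""
        if self._opened_at is not None and now - self._opened_at >= self.max_seconds:
            self._flush("time")

    def _flush(self, reason: str) -> None:
        self.batches.append((reason, len(self._buffer)))
        self._buffer, self._size, self._opened_at = [], 0, None


delivery = BufferedDelivery(max_bytes=10, max_seconds=60)
delivery.put(b"12345", now=0)    # 5 bytes buffered
delivery.put(b"123456", now=1)   # 11 bytes, size threshold reached
delivery.put(b"1", now=5)        # a new buffer opens at t=5
delivery.tick(now=64)            # 59 seconds elapsed, nothing to do
delivery.tick(now=65)            # 60 seconds elapsed, time threshold reached
print(delivery.batches)  # [('size', 2), ('time', 1)]
```

The last record waits a full interval before delivery even though it arrived alone, which is the latency cost of batching on a quiet stream.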

Firehose supports automatic format conversion: it can convert JSON to Parquet or ORC before writing to S3, using a schema from the AWS Glue Data Catalog. This eliminates the need for a separate conversion job.

When to use Firehose instead of KDS: When you need simple delivery to S3, Redshift, or OpenSearch without writing consumer code. Firehose handles scaling, retries, and format conversion. When you need custom processing logic or sub-minute latency, use KDS.


Kinesis Data Analytics

Kinesis Data Analytics (KDA), since renamed Amazon Managed Service for Apache Flink, lets you process data streams using SQL or Apache Flink without managing infrastructure.

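KDA for SQL (classic): Write standard SQL queries against an input stream. Use time-based windows (tumbling, sliding, stagger) to aggregate records over intervals. The output goes to another Kinesis stream or Firehose. AWS has announced that SQL applications are being discontinued, so check current availability before building on this option.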

KDA for Apache Flink: A managed Apache Flink environment where you write Flink applications in Java, Python, or Scala. (Studio is a separate notebook-based mode for interactive queries.) Flink is more powerful than SQL: it supports complex stateful processing, custom windowing, joins across streams, and exactly-once semantics.

Example use case: A payment processor wants to flag any merchant whose transaction volume spikes more than 3x their 10-minute average. KDA for SQL:

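-- Stage 1: per-merchant transaction counts in 1-minute tumbling windows.
CREATE OR REPLACE STREAM "COUNT_STREAM" (
    merchant_id VARCHAR(32),
    tx_count    INTEGER
);
CREATE OR REPLACE PUMP "COUNT_PUMP" AS
INSERT INTO "COUNT_STREAM"
SELECT STREAM merchant_id, COUNT(*) AS tx_count
FROM "SOURCE_STREAM"
GROUP BY merchant_id, STEP("SOURCE_STREAM".ROWTIME BY INTERVAL '1' MINUTE);
-- Stage 2: compare each count with that merchant's trailing 10-minute average.
-- The window includes the current minute, so a spike also raises the average.
CREATE OR REPLACE STREAM "ANOMALY_STREAM" (
    merchant_id VARCHAR(32),
    tx_count    INTEGER,
    avg_count   DOUBLE
);
CREATE OR REPLACE PUMP "ANOMALY_PUMP" AS
INSERT INTO "ANOMALY_STREAM"
SELECT STREAM merchant_id, tx_count, avg_count
FROM (
    SELECT STREAM merchant_id, tx_count,
           AVG(CAST(tx_count AS DOUBLE)) OVER W AS avg_count
    FROM "COUNT_STREAM"
    WINDOW W AS (PARTITION BY merchant_id RANGE INTERVAL '10' MINUTE PRECEDING)
) AS t
WHERE tx_count > avg_count * 3;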

Kinesis Video Streams

Kinesis Video Streams (KVS) is for ingesting, storing, and processing video, audio, and time-serialised data from cameras and IoT devices. It handles the protocol complexity of device connections and provides APIs for real-time streaming and on-demand playback.

Use cases:

KVS integrates with Amazon Rekognition Video for real-time object and face detection, and with SageMaker for custom ML inference on video frames.


Choosing the Right Kinesis Service

Need to process data in real time with custom code?
└── Kinesis Data Streams
Need to deliver data to S3, Redshift, or OpenSearch without writing consumers?
└── Kinesis Data Firehose
Need to run SQL or Flink queries against a stream?
└── Kinesis Data Analytics
Ingesting video or audio from cameras or devices?
└── Kinesis Video Streams

Common combination: IoT devices → KDS → (KDA for real-time anomaly detection AND Firehose for S3 archival). Both KDA and Firehose read from the same KDS stream independently.


Real-World Scenario: E-Commerce Real-Time Dashboard

An e-commerce platform needs a live dashboard showing orders per minute, revenue in the last 5 minutes, and alerts when the error rate on checkout exceeds 2%.

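Architecture: Order and checkout events flow into a single Kinesis Data Streams stream. KDA aggregates that stream, and a Lambda function writes the results to DynamoDB for the dashboard to read. In parallel, Firehose delivers the same events to S3, where Athena can query them.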

This separates the real-time path (KDA + Lambda + DynamoDB) from the batch analytics path (Firehose + S3 + Athena), both reading from the same source stream.
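The three dashboard metrics can be sketched in plain Python to show what the real-time path has to compute. The event shape, the function names, and the definition of error rate (errors divided by all checkout events in the window) are assumptions made for illustration; a production version would run as a streaming query rather than over a list.

```python
from collections import Counter

# Hypothetical event shape: (epoch_seconds, kind, amount).
# kind is "order" or "checkout_error"; amount is 0.0 for errors.
events = [
    (0, "order", 20.0),
    (30, "order", 35.0),
    (61, "checkout_error", 0.0),
    (90, "order", 10.0),
    (150, "order", 25.0),
    (400, "order", 50.0),
]


def orders_per_minute(events):
    """Tumbling 1-minute windows: count orders per minute index."""
    return Counter(ts // 60 for ts, kind, _ in events if kind == "order")


def revenue_last(events, now, window_seconds=300):
    """Sliding window: revenue from orders in (now - window, now]."""
    return sum(
        amount
        for ts, kind, amount in events
        if kind == "order" and now - window_seconds < ts <= now
    )


def checkout_error_rate(events, now, window_seconds=300):
    """Share of checkout events in the window that were errors."""
    recent = [kind for ts, kind, _ in events if now - window_seconds < ts <= now]
    return recent.count("checkout_error") / len(recent) if recent else 0.0


print(orders_per_minute(events)[0])          # 2
print(revenue_last(events, now=400))         # 75.0
rate = checkout_error_rate(events, now=300)
print(rate, "ALERT" if rate > 0.02 else "ok")  # 0.25 ALERT
```

Orders per minute maps naturally to a tumbling window, while revenue and error rate over the last 5 minutes are sliding windows that are re-evaluated as time advances.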


Interview Notes

Q: How do you calculate the number of shards you need?
Divide the peak record rate by 1,000 records/second and the peak data rate by 1 MB/second, then take the larger result, because whichever limit binds first sets the shard count. Add headroom, typically 20-25%, for unexpected traffic spikes. If you have 5,000 records/second at peak, you need at least 5 shards, so 6-7 with headroom.
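The sizing rule above can be written as a short function. This is a sketch of the arithmetic only; `required_shards` and its parameters are hypothetical names, and it covers the write side, so a read-heavy stream may need more shards than it reports.

```python
import math

SHARD_RECORDS_PER_SEC = 1000  # per-shard write limit in records
SHARD_MB_PER_SEC = 1.0        # per-shard write limit in megabytes


def required_shards(peak_records_per_sec: float,
                    peak_mb_per_sec: float,
                    headroom: float = 0.25) -> int:
    """Return the shard count for the binding write limit plus headroom."""
    by_records = peak_records_per_sec / SHARD_RECORDS_PER_SEC
    by_bytes = peak_mb_per_sec / SHARD_MB_PER_SEC
    base = max(by_records, by_bytes)  # whichever constraint binds first
    return max(1, math.ceil(base * (1 + headroom)))


# 5,000 small records/second: the record limit binds (5 shards before headroom).
print(required_shards(5000, 2.5))   # 7

# 2,000 large records/second at 8 MB/second: the byte limit binds.
print(required_shards(2000, 8.0))   # 10
```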

Q: What is the difference between Kinesis and SQS?
SQS is a message queue: each message is handed to one consumer, which deletes it once processing succeeds. Kinesis is a stream: multiple consumers can read the same record independently, and records are retained for hours or days whether or not anyone has read them. SQS is for task distribution; Kinesis is for stream processing with multiple downstream consumers.

Q: What happens when a shard is hot (one shard receives disproportionate traffic)?
The shard hits its 1 MB/second or 1,000 records/second limit and producers receive ProvisionedThroughputExceededException errors. Fix this by choosing a better partition key that distributes records more evenly across shards, or by adding a random suffix to the partition key. The suffix trades away per-key ordering, because records for one logical key no longer land on a single shard.
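A minimal sketch of why a suffix helps. Kinesis maps a partition key to a shard by taking the MD5 hash of the key as a 128-bit integer and finding the shard whose hash key range contains it. The code below assumes the hash space is split evenly across shards, which is not guaranteed after splits and merges, and it uses a round-robin suffix rather than a random one so the run is reproducible. `shard_for` and `salted_key` are hypothetical helper names.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for(partition_key: str, shard_count: int) -> int:
    """Map a key to a shard index, assuming evenly split hash key ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


def salted_key(partition_key: str, record_index: int, salt_buckets: int) -> str:
    """Append a suffix so one logical key spreads over several partition keys."""
    return f"{partition_key}-{record_index % salt_buckets}"


SHARDS = 4

# Without a suffix, every record for this merchant hashes to the same shard.
unsalted = Counter(shard_for("merchant-42", SHARDS) for _ in range(1000))

# With 16 suffix values, the same traffic is spread over 16 partition keys.
salted = Counter(
    shard_for(salted_key("merchant-42", i, 16), SHARDS) for i in range(1000)
)

print("shards used without suffix:", len(unsalted))
print("shards used with suffix:   ", len(salted))
```

Consumers that need all records for one merchant then have to read every shard and regroup by the original key, which is the price of the more even distribution.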

Q: What is enhanced fan-out and when do you need it?
Standard KDS consumers share 2 MB/second read throughput per shard across all consumers. Enhanced fan-out gives each registered consumer its own dedicated 2 MB/second per shard via HTTP/2 push. Use it when you have three or more independent consumers reading the same stream and standard throughput is insufficient.
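The difference is easiest to see as arithmetic. The sketch below assumes standard consumers split a shard's read throughput evenly, which is a simplification, and `per_consumer_read_mb_per_sec` is a hypothetical helper name.

```python
SHARD_READ_MB_PER_SEC = 2.0  # read throughput of one shard


def per_consumer_read_mb_per_sec(consumers: int, enhanced_fan_out: bool) -> float:
    """Read throughput each consumer can expect from a single shard."""
    if consumers < 1:
        raise ValueError("consumers must be at least 1")
    if enhanced_fan_out:
        return SHARD_READ_MB_PER_SEC          # dedicated per consumer
    return SHARD_READ_MB_PER_SEC / consumers  # shared across consumers


for n in (1, 2, 4):
    shared = per_consumer_read_mb_per_sec(n, enhanced_fan_out=False)
    dedicated = per_consumer_read_mb_per_sec(n, enhanced_fan_out=True)
    print(f"{n} consumers: shared {shared} MB/s each, dedicated {dedicated} MB/s each")
```

With four standard consumers, each one gets 0.5 MB/second per shard, which is below the 1 MB/second a fully loaded shard can ingest, so those consumers would fall behind at peak.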