AWS Lambda: Serverless Functions Without the Infrastructure Management
Lambda is a compute service where you provide the code and AWS provides everything else. There is no AMI to select, no instance type to choose, no OS to patch. You write a function, define its memory allocation, configure what triggers it, and that’s the operational surface you manage.
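The billing model matches this simplicity: you pay for the number of requests and for the time your function executes, billed in 1 ms increments (rounded up) and scaled by the memory you configure. At list prices of about $0.20 per million requests and $0.0000166667 per GB-second (x86; check current pricing for your Region), a 128 MB function that runs for 100 ms costs roughly 0.00004 cents. A function that never runs costs nothing.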
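The billing arithmetic can be sketched in a few lines. This is an estimate under stated assumptions: the two prices below are assumed x86 list prices and may differ by Region, architecture, or date, and the free tier is ignored.

```python
# Assumed list prices (x86); check the current AWS Lambda pricing page
# before relying on them.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD per request
PRICE_PER_GB_SECOND = 0.0000166667     # USD per GB-second


def invocation_cost_usd(memory_mb: int, duration_ms: int) -> float:
    """Estimate the cost of one invocation, ignoring the free tier."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND


if __name__ == "__main__":
    # Same 100 ms duration at three memory settings.
    for memory_mb in (128, 1024, 10240):
        cost = invocation_cost_usd(memory_mb, 100)
        print(f"{memory_mb:>5} MB, 100 ms: {cost * 100:.6f} cents")
```

The point of the comparison is that duration alone does not determine cost: the same 100 ms is billed very differently at 128 MB and at 10 GB.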
How Lambda Executes Your Code
When an event triggers a Lambda function, AWS either reuses an existing execution environment or creates a new one:
First invocation (cold start): Event → Create environment → Initialise runtime → Run init code → Handle event
Subsequent invocations (warm): Event → Reuse existing environment → Handle event (no init code again)
An execution environment is an isolated container with your function code, its dependencies, and the runtime. AWS keeps it alive for a period after the function completes, so the next invocation reuses it without the cold start cost.
The init code (code outside your handler function) runs once per environment. This is where you initialise database connections, load configuration, and set up SDK clients — things you want to pay for once, not per request.
import boto3
import os

# This runs once per execution environment (cold start only)
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])

def lambda_handler(event, context):
    # This runs on every invocation
    user_id = event['userId']
    response = table.get_item(Key={'userId': user_id})
    return response.get('Item', {})

Event Sources and Triggers
Lambda functions are invoked in one of three ways:
Synchronous invocation: The caller waits for the function to complete and gets the return value. API Gateway, Cognito user pool triggers, and direct SDK calls use this model. Errors are returned to the caller.
Asynchronous invocation: Lambda places the event on an internal queue and returns immediately, so the caller does not wait for the result. Lambda retries on failure (up to 2 retries by default). S3 events, SNS, and EventBridge use this model.
Polling invocation: Lambda polls a source and batches records. SQS, Kinesis, DynamoDB Streams, and Kafka use this model. Lambda manages the polling, so you do not need separate polling infrastructure.
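For the polling model, a handler receives a batch of records rather than a single event. The sketch below shows one way to handle an SQS batch so that a single bad message does not force the whole batch to be retried. It assumes the event source mapping has `ReportBatchItemFailures` enabled; `process_order` and the `orderId` field are hypothetical stand-ins for real business logic.

```python
import json


def process_order(order: dict) -> None:
    """Hypothetical business logic; raises on a malformed order."""
    if "orderId" not in order:
        raise ValueError("missing orderId")


def lambda_handler(event, context):
    # Each SQS record carries a messageId and a string body.
    failures = []
    for record in event["Records"]:
        try:
            process_order(json.loads(record["body"]))
        except Exception:
            # Report only this message as failed so the rest of the
            # batch is not retried.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Returning an empty `batchItemFailures` list tells Lambda the whole batch succeeded; without the partial-failure setting, any raised exception would make the entire batch visible again for retry.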
┌──────────────────────────────────────────────────────────────┐
│                      Lambda Trigger Map                      │
│                                                              │
│  SYNCHRONOUS         ASYNCHRONOUS        POLLING             │
│  ─────────────────────────────────────────────────────────   │
│  API Gateway         S3 Events           SQS Queue           │
│  ALB                 SNS Topic           Kinesis Stream      │
│  Cognito             EventBridge         DynamoDB Stream     │
│  CloudFront          CodePipeline        MSK / Kafka         │
│  Direct SDK call     IoT Core            MQ (ActiveMQ)       │
└──────────────────────────────────────────────────────────────┘

Writing a Lambda Function
A Lambda function needs a handler — a function AWS calls on each invocation. The handler receives an event (the trigger payload) and a context object (metadata about the invocation).
def lambda_handler(event, context):
    # event: the trigger payload (dict for most sources)
    # context: invocation metadata
    #   context.aws_request_id
    #   context.get_remaining_time_in_millis()

    print(f"Request ID: {context.aws_request_id}")
    print(f"Time remaining: {context.get_remaining_time_in_millis()}ms")

    return {
        'statusCode': 200,
        'body': 'OK'
    }

Deploying a Lambda Function
# Package your code
zip function.zip lambda_function.py

# Create the function
aws lambda create-function \
  --function-name process-orders \
  --runtime python3.12 \
  --role arn:aws:iam::123456789012:role/lambda-exec-role \
  --handler lambda_function.lambda_handler \
  --zip-file fileb://function.zip \
  --memory-size 256 \
  --timeout 30 \
  --environment Variables='{TABLE_NAME=orders,REGION=us-east-1}'

For larger deployments with dependencies, use Lambda Layers or a container image:

# Deploy as container image (supports up to 10 GB)
aws lambda create-function \
  --function-name ml-inference \
  --package-type Image \
  --code ImageUri=123456789012.dkr.ecr.us-east-1.amazonaws.com/ml-model:latest \
  --role arn:aws:iam::123456789012:role/lambda-exec-role \
  --memory-size 3008 \
  --timeout 60

Memory, CPU, and Timeout
Lambda allocates CPU proportional to memory. At 128 MB, you get a fraction of a vCPU. At 1,769 MB, you get one full vCPU. At 3,538 MB, two vCPUs.
For CPU-bound work, increasing memory often reduces duration enough to offset the higher per-millisecond price, because the extra CPU shortens the billed time. Test with AWS Lambda Power Tuning (an open-source Step Functions workflow) to find the optimal memory setting.
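The trade-off can be checked by hand once you have measurements. Duration cost is proportional to memory multiplied by billed time, so the cheapest setting is the one with the smallest product. The sketch below applies that rule to made-up numbers; the durations are hypothetical, not benchmark results, and the per-request charge is left out because it does not change with memory.

```python
from typing import Dict


def relative_cost(memory_mb: int, duration_ms: float) -> float:
    """Duration cost is proportional to memory x billed time (MB-ms)."""
    return memory_mb * duration_ms


def cheapest_setting(measurements: Dict[int, float]) -> int:
    """Return the memory size (MB) with the lowest relative cost."""
    return min(measurements, key=lambda mb: relative_cost(mb, measurements[mb]))


# Hypothetical average durations (ms) for a CPU-bound function.
measured = {128: 4200.0, 512: 1000.0, 1024: 520.0, 2048: 400.0}

if __name__ == "__main__":
    for mb, ms in measured.items():
        print(f"{mb:>5} MB: {ms:>7.1f} ms, relative cost {relative_cost(mb, ms):,.0f}")
    print("cheapest:", cheapest_setting(measured), "MB")
```

In this made-up data set the smallest memory setting is not the cheapest, and the largest setting is the fastest but the most expensive, which is the pattern a tuning run is meant to expose.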
Maximum timeout is 15 minutes. For anything longer, consider Step Functions, ECS tasks, or breaking the work into smaller Lambda invocations.
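One way to break long work into smaller invocations is to stop before the timeout and return a checkpoint that the caller passes to the next invocation. This is a sketch under assumptions: the event shape (`items`, `start`), the `handle_item` helper, and the 10-second safety margin are all hypothetical, and something outside the function (for example a Step Functions loop) is assumed to re-invoke it with the returned `start` value.

```python
SAFETY_MARGIN_MS = 10_000  # assumed headroom; tune for your workload

processed = []  # stands in for real side effects in this sketch


def handle_item(item):
    """Hypothetical unit of work."""
    processed.append(item)


def lambda_handler(event, context):
    # Hypothetical event shape: {"items": [...], "start": <index to resume from>}
    items = event["items"]
    index = event.get("start", 0)
    while index < len(items):
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            # Running out of time: report where the next invocation should resume.
            return {"done": False, "start": index}
        handle_item(items[index])
        index += 1
    return {"done": True, "start": index}
```

Checking the remaining time before each item, rather than after, means the function never starts a unit of work it may not be able to finish.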
Environment Variables and Configuration
Lambda supports environment variables for configuration — connection strings, feature flags, third-party API keys. These are encrypted at rest with the Lambda service key (or a customer-managed KMS key if you configure it).
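Environment variables are always strings, so it helps to parse and validate them once at init time. A minimal sketch follows: `TABLE_NAME` comes from the earlier example, while `ENABLE_AUDIT_LOG` and `DOWNSTREAM_TIMEOUT_SECONDS` are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    table_name: str
    enable_audit_log: bool
    timeout_seconds: float


def load_config(env=os.environ) -> Config:
    """Parse and validate configuration; fail fast if something is missing."""
    return Config(
        table_name=env["TABLE_NAME"],  # required: KeyError if unset
        enable_audit_log=env.get("ENABLE_AUDIT_LOG", "false").lower() == "true",
        timeout_seconds=float(env.get("DOWNSTREAM_TIMEOUT_SECONDS", "3")),
    )


# In a real function this would run at module level, outside the handler:
# CONFIG = load_config()
```

Failing at init time on a missing required variable surfaces a misconfigured deployment on the first invocation instead of deep inside a request.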
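For secrets that rotate, use AWS Secrets Manager and fetch the secret at function init time (not per invocation). A warm environment keeps using the value it fetched, so plan to refetch after a rotation, for example when the database rejects the cached password. The basic init-time fetch: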
import boto3
import json

client = boto3.client('secretsmanager')
secret = json.loads(
    client.get_secret_value(SecretId='prod/db-password')['SecretString']
)
DB_PASSWORD = secret['password']

def lambda_handler(event, context):
    # use DB_PASSWORD
    pass

Real-World Scenario: Image Processing Pipeline
A photography platform needs to generate thumbnails whenever a user uploads a photo:
- User uploads to S3 bucket originals/
- S3 event triggers Lambda function resize-image
- Lambda downloads the original, creates 3 sizes (thumb, medium, large) using Pillow
- Lambda stores results in S3 resized/
- Lambda writes metadata to DynamoDB
import io
from urllib.parse import unquote_plus

import boto3
from PIL import Image

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('photo-metadata')

SIZES = {'thumb': (150, 150), 'medium': (800, 600), 'large': (1600, 1200)}

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    # Object keys arrive URL-encoded in S3 event notifications
    key = unquote_plus(event['Records'][0]['s3']['object']['key'])

    # Download original
    obj = s3.get_object(Bucket=bucket, Key=key)
    image = Image.open(io.BytesIO(obj['Body'].read()))
    # JPEG has no alpha channel, so convert other modes before saving
    if image.mode != 'RGB':
        image = image.convert('RGB')

    # Strip the directory and the final extension only
    photo_id = key.split('/')[-1].rsplit('.', 1)[0]

    for size_name, dimensions in SIZES.items():
        resized = image.copy()
        resized.thumbnail(dimensions)  # resizes in place, keeps aspect ratio

        buffer = io.BytesIO()
        resized.save(buffer, format='JPEG', quality=85)
        buffer.seek(0)

        dest_key = f"resized/{size_name}/{photo_id}.jpg"
        s3.put_object(Bucket='photo-resized', Key=dest_key, Body=buffer)

    # Update metadata
    table.update_item(
        Key={'photoId': photo_id},
        UpdateExpression='SET #s = :s',
        ExpressionAttributeNames={'#s': 'status'},
        ExpressionAttributeValues={':s': 'processed'}
    )

    return {'status': 'success', 'photoId': photo_id}

Cold Starts: What They Are and When They Matter
A cold start occurs when Lambda creates a new execution environment. The delay is typically 100–500ms for interpreted languages and 500ms–2s for JVM-based languages. For most asynchronous workloads, this is unnoticeable. For synchronous API calls where users wait for a response, it can cause occasional latency spikes.
Mitigations:
- Provisioned Concurrency: pre-warms execution environments, eliminates cold starts, costs extra
- Smaller deployment packages: smaller packages initialise faster
- Interpreted runtimes: Python and Node.js cold starts are faster than Java or .NET
- Keep-alive patterns: a scheduled event pings the function every few minutes (not reliable: a ping keeps only one environment warm, so concurrent requests can still hit cold starts)
For customer-facing APIs where p99 latency matters, provisioned concurrency is the only reliable solution.
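Before paying for a mitigation, it can help to measure how often cold starts actually happen. A minimal sketch, relying only on the fact that module-level state persists across warm invocations in the same environment; the `coldStart` log field name is an assumption, not a Lambda convention.

```python
import json

# Module-level state: set once per execution environment, at init time.
_is_cold_start = True


def lambda_handler(event, context):
    global _is_cold_start
    cold = _is_cold_start
    _is_cold_start = False  # every later invocation in this environment is warm

    # A structured log line that a log query or metric filter can count.
    print(json.dumps({"coldStart": cold}))
    return {"coldStart": cold}
```

Counting these log lines over a day gives the share of requests that hit a cold start, which is the number to weigh against the cost of provisioned concurrency.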
Lambda and IAM
Lambda needs an execution role — an IAM role it assumes when running. This role controls what AWS APIs the function can call. Follow least privilege:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::originals-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::resized-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": ["dynamodb:UpdateItem"],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/photo-metadata"
    }
  ]
}

Never put IAM access keys inside Lambda code. The execution role provides temporary credentials automatically: Lambda exposes them to the runtime as environment variables, and the AWS SDK picks them up without extra configuration.
Common Interview Questions
Q: What is the maximum execution duration for a Lambda function? 15 minutes. For longer-running work, use Step Functions with multiple Lambda invocations, ECS Fargate tasks, or EC2.
Q: How does Lambda scale? Lambda scales by running additional execution environments in parallel. Account default is 1,000 concurrent executions per region. You can request increases. Reserved concurrency sets aside part of that pool for a specific function and also caps that function's maximum.
Q: What is the difference between synchronous and asynchronous invocation? Synchronous: the caller waits for the result (API Gateway, direct SDK). Asynchronous: Lambda queues the event and returns immediately; it retries on failure. Error handling differs significantly between the two.
Q: How do you share code between multiple Lambda functions?
Lambda Layers: a zip archive that is extracted to /opt in the execution environment. A function can use up to five layers, and the function code plus all of its layers must fit within 250 MB unzipped.
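For the scaling question above, a useful back-of-the-envelope check is that steady-state concurrency is roughly the request rate multiplied by the average duration. The sketch below applies that rule; the traffic figures in the usage example are hypothetical.

```python
import math


def required_concurrency(requests_per_second: float, avg_duration_seconds: float) -> int:
    """Concurrent executions needed to serve a steady request rate."""
    # Round first so floating-point noise (e.g. 3.0000000000000004) does not
    # push the result up by one.
    return math.ceil(round(requests_per_second * avg_duration_seconds, 6))


if __name__ == "__main__":
    # Hypothetical traffic: 200 requests/s at 500 ms each.
    print(required_concurrency(200, 0.5))
    # Hypothetical spike: 4,000 requests/s at 500 ms each.
    print(required_concurrency(4000, 0.5))
```

Comparing the result with the account concurrency limit shows whether a traffic level fits within the default quota or needs an increase.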