AWS Interview Questions and Answers

From associate-level cloud concepts to architect-level design decisions — these questions reflect what hiring teams test across cloud engineer, DevOps, data engineer, and solutions architect roles.

Core Concepts

Q1. What is the difference between horizontal and vertical scaling in AWS?

Vertical scaling (scaling up) — move to a larger instance type (e.g., t3.medium → m5.xlarge). Simple, but capped by the largest available instance type, and resizing an EC2 instance requires stopping and restarting it.

Horizontal scaling (scaling out) — add more instances behind a load balancer. On EC2 this is handled by Auto Scaling Groups (ASG). Lambda and DynamoDB on-demand scale without capacity management, and ECS on Fargate scales tasks through Service Auto Scaling.

Horizontal scaling is preferred for cloud-native architectures because it:

Has no practical ceiling
Maintains availability during scaling events
Distributes failure blast radius

Q2. Explain the differences between S3 storage classes.

Storage Class	Use Case	Retrieval	Min Duration
S3 Standard	Frequently accessed data	Instant	None
S3 Intelligent-Tiering	Unknown/changing access patterns	Instant	None
S3 Standard-IA	Infrequent access, must be fast	Instant	30 days
S3 One Zone-IA	Infrequent, non-critical	Instant	30 days
S3 Glacier Instant	Archive with ms retrieval	Milliseconds	90 days
S3 Glacier Flexible	Archive, minutes–hours retrieval	Minutes–hours	90 days
S3 Glacier Deep Archive	Lowest cost, long-term archive	Within 12h (standard); up to 48h (bulk)	180 days

S3 Lifecycle Policies automate transitions between classes based on object age.
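
As a rough sketch, a lifecycle rule can be read as a list of age thresholds. The dictionary below follows the general shape of an S3 lifecycle configuration (Rules, Transitions, Expiration). The rule ID, the logs/ prefix, and the day counts are illustrative assumptions, and storage_class_for_age is a hypothetical helper that only simulates which class an object would be in at a given age.

```python
# Illustrative lifecycle configuration; the ID, prefix, and day counts are assumptions.
LIFECYCLE_CONFIG = {
    "Rules": [
        {
            "ID": "archive-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
            ],
            "Expiration": {"Days": 730},
        }
    ]
}


def storage_class_for_age(rule, age_days):
    """Return the class an object of this age would be in, or None once expired."""
    expiration = rule.get("Expiration", {}).get("Days")
    if expiration is not None and age_days >= expiration:
        return None
    current = "STANDARD"  # objects are assumed to start in S3 Standard
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current
```

With this rule, a 100-day-old object under logs/ would sit in the GLACIER class (Glacier Flexible Retrieval), and objects would be removed after two years.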

Q3. What is the difference between Security Groups and Network ACLs?

Feature	Security Group	Network ACL
Level	Instance level	Subnet level
State	Stateful (return traffic auto-allowed)	Stateless (must allow inbound AND outbound explicitly)
Rules	Allow rules only	Allow and Deny rules
Evaluation	All rules evaluated	Rules evaluated in number order; first match wins
Default	New group: deny all inbound, allow all outbound	Default NACL: allow all in/out; custom NACL: deny all until rules are added

Typical pattern: Security Groups for fine-grained per-instance control; NACLs as an extra layer to block known bad IP ranges at the subnet boundary.
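
The evaluation difference in the table can be sketched in a few lines. This is a simplified model, not the AWS implementation: the rule dictionaries, their field names, and the IP ranges (taken from the documentation-only address blocks) are assumptions made for illustration, and only inbound traffic is modeled.

```python
import ipaddress


def nacl_decision(rules, source_ip, port):
    """Stateless NACL: rules are checked in ascending number order; first match wins."""
    addr = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r["number"]):
        in_cidr = addr in ipaddress.ip_network(rule["cidr"])
        in_ports = rule["from_port"] <= port <= rule["to_port"]
        if in_cidr and in_ports:
            return rule["action"]
    return "deny"  # the implicit final rule denies anything unmatched


def security_group_allows(rules, source_ip, port):
    """Security group: allow rules only; traffic passes if any rule matches."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(r["cidr"]) and r["from_port"] <= port <= r["to_port"]
        for r in rules
    )


# Hypothetical inbound rules: block one known-bad range, then allow HTTPS from anywhere.
NACL_INBOUND = [
    {"number": 100, "cidr": "203.0.113.0/24", "from_port": 0, "to_port": 65535, "action": "deny"},
    {"number": 200, "cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443, "action": "allow"},
]
SG_INBOUND = [{"cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443}]
```

Because rule 100 is evaluated before rule 200, traffic from 203.0.113.0/24 is dropped at the subnet boundary even though the security group alone would have allowed it on port 443.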

Q4. Describe the components of a VPC.

A VPC is a logically isolated virtual network within an AWS region:

Subnets — segments of the VPC’s CIDR block in a single AZ. Public subnets have a route to an Internet Gateway; private subnets don’t.
Internet Gateway (IGW) — enables bidirectional traffic between VPC resources and the internet
NAT Gateway — allows private subnet resources to initiate outbound internet connections without being directly reachable from the internet
Route Tables — define where traffic is directed (e.g., 0.0.0.0/0 → IGW for public subnets)
VPC Peering — private connection between two VPCs (same or different accounts/regions)
Transit Gateway — hub-and-spoke for connecting many VPCs and on-premises networks
VPC Endpoints — private connectivity to AWS services without leaving the AWS network (Gateway endpoints for S3/DynamoDB; Interface endpoints via PrivateLink for others)
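
To make the route table and public/private subnet distinction concrete, here is a minimal sketch of a route lookup. It assumes the most specific (longest-prefix) matching route wins. The route tables and the target names igw-example and nat-example are placeholders rather than real resource IDs, and is_public is a hypothetical helper.

```python
import ipaddress

# Hypothetical route tables; the target names are placeholders, not real resource IDs.
PUBLIC_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-example"}
PRIVATE_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-example"}


def route_target(route_table, destination_ip):
    """Pick the most specific (longest-prefix) route that contains the destination."""
    addr = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    network, target = max(matches, key=lambda m: m[0].prefixlen)
    return target


def is_public(route_table):
    """A subnet counts as public when its default route points at an Internet Gateway."""
    return route_table.get("0.0.0.0/0", "").startswith("igw-")
```

Traffic to 10.0.1.25 stays on the local route in both tables, while internet-bound traffic leaves through the Internet Gateway from the public subnet and through the NAT Gateway from the private one.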

Q5. What is IAM and what are its key components?

IAM (Identity and Access Management) controls who can do what in your AWS account:

Users — identities for individual people or applications, with long-term credentials (password, access keys)
Groups — collection of users sharing the same permissions
Roles — assumed by services (EC2, Lambda), users, or cross-account principals; temporary credentials via STS
Policies — JSON documents defining Allow/Deny permissions on specific actions and resources
Permission boundaries — cap the maximum permissions a user or role can have, regardless of attached policies

Best practices: follow least privilege, use roles for all service-to-service authentication, require MFA for human users, and never use the root account for daily work.
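
The interaction between policies and permission boundaries can be sketched with a deliberately simplified evaluator. It models only identity policies and one boundary: an explicit Deny wins, an Allow is required, and everything else is implicitly denied. Real IAM evaluation also considers resource policies, service control policies, session policies, and conditions, and matches action names case-insensitively. The bucket name and the helper functions here are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical policies; the bucket name is a placeholder.
IDENTITY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Deny", "Action": "s3:DeleteObject", "Resource": "*"},
    ],
}
BOUNDARY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "*"},
    ],
}


def _as_list(value):
    return value if isinstance(value, list) else [value]


def _matches(statement, action, resource):
    return any(fnmatchcase(action, a) for a in _as_list(statement["Action"])) and any(
        fnmatchcase(resource, r) for r in _as_list(statement["Resource"])
    )


def policy_result(policy, action, resource):
    """Return 'deny' on an explicit Deny, 'allow' on an Allow, else 'implicit_deny'."""
    effects = {s["Effect"] for s in policy["Statement"] if _matches(s, action, resource)}
    if "Deny" in effects:
        return "deny"
    return "allow" if "Allow" in effects else "implicit_deny"


def is_allowed(action, resource, identity_policy, boundary=None):
    """Allowed only if the identity policy allows it and the boundary (if any) does too."""
    if policy_result(identity_policy, action, resource) != "allow":
        return False
    return boundary is None or policy_result(boundary, action, resource) == "allow"
```

In this sketch s3:PutObject is allowed by the identity policy alone but blocked once the boundary is attached, which is the capping behavior described above.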

Compute

Q6. When would you choose Lambda vs EC2 vs ECS Fargate?

Service	Choose when
Lambda	Event-driven, short-lived (≤15 min), variable/spiky traffic, no infrastructure management
EC2	Long-running processes, specific OS/kernel requirements, need for persistent local storage, GPU workloads
ECS Fargate	Containerized workloads, team knows Docker, want managed infrastructure without EC2 management
EKS	Kubernetes required, complex microservices, multi-cloud portability needed

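Lambda cold starts (typically from around 100 ms to a few seconds, depending on runtime and package size) matter for latency-sensitive APIs. Provisioned Concurrency keeps execution environments initialized so that requests avoid them.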

Q7. What is an Auto Scaling Group and how does it scale?

An ASG maintains a fleet of EC2 instances between minimum and maximum limits, automatically adding or removing instances based on:

Target tracking — maintain a target metric value (e.g., keep CPU at 50%). Simplest to configure.
Step scaling — scale by specific amounts when alarms breach thresholds (e.g., add 2 instances when CPU >70%, add 5 when CPU >85%)
Scheduled scaling — scale at predictable times (e.g., add capacity at 8 AM on weekdays)
Predictive scaling — ML-based forecasting of demand to scale proactively

Lifecycle hooks pause an instance during launch or termination so that custom actions can run (e.g., drain connections before termination).
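
The step-scaling example in the list above (add 2 instances when CPU >70%, add 5 when CPU >85%) can be sketched as a small function. This is an illustrative model rather than the ASG algorithm itself. The function name and the min/max values used below are assumptions.

```python
# Step adjustments from the example above: +2 instances above 70% CPU, +5 above 85%.
STEP_ADJUSTMENTS = [(85.0, 5), (70.0, 2)]  # (CPU threshold, instances to add), highest first


def step_scaling_capacity(current, cpu_percent, min_size, max_size):
    """Return the new desired capacity after applying the first matching step."""
    adjustment = 0
    for threshold, add in STEP_ADJUSTMENTS:
        if cpu_percent > threshold:
            adjustment = add
            break
    # An ASG never goes outside its configured minimum and maximum.
    return max(min_size, min(max_size, current + adjustment))
```

For a group with 4 instances, a minimum of 2, and a maximum of 10, a CPU reading of 75% would raise the desired capacity to 6, and 90% would raise it to 9. Starting from 8 instances, the same 90% reading is clamped to the maximum of 10.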

Storage & Databases

Q8. What is the difference between RDS and DynamoDB?

Aspect	RDS	DynamoDB
Type	Relational (SQL)	NoSQL (key-value + document)
Schema	Fixed, predefined	Flexible per item
Scaling	Vertical + read replicas	Horizontal, automatic
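Consistency	Strong by default	Eventually consistent reads by default; strongly consistent reads optional per request
Query flexibility	Full SQL	Key-based queries on the primary key and secondary indexes (GSIs, LSIs); full scans are possible but costly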
Best for	Complex queries, transactions, relational data	High-throughput, simple access patterns, gaming, sessions

RDS Multi-AZ provides synchronous replication for HA. RDS Read Replicas (async) offload read traffic.
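
The query-flexibility row is easier to see side by side. In the sketch below, an in-memory SQLite database stands in for a relational engine and a plain dictionary keyed by (partition key, sort key) stands in for a key-value table. Neither is the RDS or DynamoDB API, and the table names, key formats, and data are made up for illustration.

```python
import sqlite3

# Relational side: an in-memory SQLite database stands in for an RDS engine.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Lin');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
    """
)
# Ad hoc join and aggregate: no access pattern had to be planned in advance.
totals = conn.execute(
    "SELECT c.name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id GROUP BY c.name ORDER BY c.name"
).fetchall()

# Key-value side: a dict keyed by (partition key, sort key) stands in for a table.
items = {
    ("CUSTOMER#1", "ORDER#10"): {"total": 25.0},
    ("CUSTOMER#1", "ORDER#11"): {"total": 40.0, "gift": True},  # extra attribute is fine
    ("CUSTOMER#2", "ORDER#12"): {"total": 15.0},
}


def query(partition_key, sort_prefix=""):
    """Return items for one partition key, optionally filtered by sort-key prefix."""
    return [
        item
        for (pk, sk), item in sorted(items.items())
        if pk == partition_key and sk.startswith(sort_prefix)
    ]
```

The SQL side answers a question nobody planned for with a join. The key-value side is fast and schema-flexible, but only for lookups that the key design anticipated.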

Q9. Explain S3 versioning and how it protects against accidental deletion.

When versioning is enabled on an S3 bucket:

Every PUT creates a new version with a unique version ID
DELETE on an object adds a delete marker (soft delete) — the object is hidden but not gone
To permanently delete a versioned object, you must delete a specific version ID

Restoring a deleted file: delete the delete marker to make the previous version current again.

Lifecycle rules can automatically expire old versions after N days to control storage costs.

For extra protection: enable S3 Object Lock (WORM — Write Once Read Many) for immutable compliance storage.
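
The delete-marker behavior described above can be modeled with a toy in-memory bucket. This is a sketch of the semantics only, not the S3 API: the VersionedBucket class, its method names, and the sequential version IDs are hypothetical.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body); body None = delete marker
        self._ids = itertools.count(1)

    def put(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key, version_id=None):
        if version_id is None:
            # Simple DELETE: add a delete marker on top; older versions stay intact.
            return self.put(key, None)
        # DELETE with a version ID permanently removes that version (or marker).
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]
        return version_id

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is None:
            return None  # the current version is missing or is a delete marker
        return versions[-1][1]
```

A short usage example: after two puts and a plain delete, get returns None. Deleting the marker by its version ID makes the latest real version current again.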

High Availability & Architecture

Q10. What is the difference between an Application Load Balancer, Network Load Balancer, and Gateway Load Balancer?

ALB	NLB	GWLB
Layer 7 (HTTP/HTTPS/WebSocket)	Layer 4 (TCP/UDP/TLS)	Layer 3 (IP)
Content-based routing (path, host, header)	Ultra-low latency, millions of req/sec	Route traffic through third-party appliances
Best for REST APIs, microservices	Best for gaming, IoT, financial services	Best for firewalls, intrusion detection

ALB target groups can be EC2, Lambda, containers, or IPs. NLB can preserve the client IP without X-Forwarded-For headers.
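
Content-based routing on an ALB can be pictured as an ordered rule list in front of target groups. The sketch below assumes rules are evaluated in priority order with the lowest number first and a default action last. The hostnames, paths, and target group names are invented, and the wildcard matching is a simplification of the patterns an ALB supports.

```python
from fnmatch import fnmatchcase

# Hypothetical listener rules; lower priority numbers are evaluated first.
LISTENER_RULES = [
    {"priority": 10, "host": "api.example.com", "path": "/orders/*", "target_group": "orders-svc"},
    {"priority": 20, "host": "api.example.com", "path": "*", "target_group": "api-default"},
    {"priority": 30, "host": "*", "path": "/static/*", "target_group": "static-assets"},
]
DEFAULT_TARGET_GROUP = "web-frontend"


def route_request(host, path):
    """Return the target group of the first rule whose host and path patterns match."""
    for rule in sorted(LISTENER_RULES, key=lambda r: r["priority"]):
        if fnmatchcase(host, rule["host"]) and fnmatchcase(path, rule["path"]):
            return rule["target_group"]
    return DEFAULT_TARGET_GROUP  # the listener's default action
```

An NLB has no equivalent of this step: it forwards connections at layer 4 without inspecting hosts or paths.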

Q11. What is CloudFront and how does it work?

CloudFront is AWS’s CDN. It caches content at hundreds of edge locations worldwide to reduce latency for end users.

Flow: User request → Nearest Edge Location → If cached: serve directly; If not: fetch from origin (S3, ALB, EC2) → cache → serve

Key features:

Origin Shield — intermediate caching layer to reduce origin load
Lambda@Edge / CloudFront Functions — run code at edge for auth, redirects, A/B testing
Signed URLs/Cookies — restrict content access to authorized users
WAF integration — filter malicious traffic at the edge

Cache behavior control: Cache-Control headers from origin, or TTL settings in CloudFront distribution.
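
The request flow and TTL behavior can be sketched as a toy edge cache. This models a single edge location with one fixed TTL. The EdgeCache class, the origin function, and the "Hit"/"Miss" labels are illustrative assumptions, and time is passed in explicitly to keep the example deterministic.

```python
class EdgeCache:
    """Toy edge location: serve from cache while fresh, otherwise fetch from origin."""

    def __init__(self, fetch_from_origin, ttl_seconds):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._store = {}  # path -> (body, expires_at)

    def get(self, path, now):
        cached = self._store.get(path)
        if cached is not None and now < cached[1]:
            return cached[0], "Hit"
        body = self._fetch(path)  # cache miss or expired entry: go back to the origin
        self._store[path] = (body, now + self._ttl)
        return body, "Miss"


origin_calls = []


def origin(path):
    """Hypothetical origin that records each request it receives."""
    origin_calls.append(path)
    return f"content of {path}"


edge = EdgeCache(origin, ttl_seconds=60)
```

With a 60-second TTL, a request at t=0 is a miss, a request at t=30 is served from the edge, and a request at t=61 goes back to the origin, so the origin sees two requests instead of three.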

Monitoring & Costs

Q12. What are the key AWS cost optimization strategies?

Right-sizing: analyze CloudWatch metrics to find over-provisioned instances. Use AWS Compute Optimizer for recommendations.

Savings Plans & Reserved Instances: commit to consistent usage for a 1- or 3-year term in exchange for discounts of up to 72%.

Spot Instances: up to 90% discount for fault-tolerant, interruptible workloads (batch processing, CI/CD, rendering).

S3 cost optimization: lifecycle policies to move data to cheaper tiers; S3 Intelligent-Tiering for unknown access patterns.

Data transfer: keep traffic within the same AZ where possible; use VPC Endpoints to avoid NAT Gateway charges for S3/DynamoDB; use CloudFront to cache at edge.

Cost monitoring: AWS Cost Explorer for trends; AWS Budgets for alerts; Cost Allocation Tags to attribute costs to teams/projects.
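
To put the discount figures in perspective, here is a small worked calculation. The hourly rate is a made-up number rather than an AWS price, and the 72% and 90% values are the upper bounds quoted above, so the results are best-case estimates.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


def monthly_cost(hourly_rate, instances, discount=0.0):
    """Monthly cost for a fleet running all month at a given fractional discount."""
    return round(hourly_rate * (1.0 - discount) * instances * HOURS_PER_MONTH, 2)


ON_DEMAND_RATE = 0.10  # hypothetical USD per instance-hour, not a real AWS price

on_demand = monthly_cost(ON_DEMAND_RATE, instances=10)
savings_plan_best_case = monthly_cost(ON_DEMAND_RATE, instances=10, discount=0.72)
spot_best_case = monthly_cost(ON_DEMAND_RATE, instances=10, discount=0.90)
```

Under these assumptions a ten-instance fleet costs about 730 USD per month on demand, about 204 USD with the maximum commitment discount, and about 73 USD on Spot at the maximum discount. Actual discounts vary by instance family, region, and term.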

Q13. How does CloudWatch differ from CloudTrail?

CloudWatch	CloudTrail
Performance and operational monitoring	Audit and governance log of API calls
Metrics, logs, alarms, dashboards	”Who did what, when, from where”
Monitor CPU, latency, errors, and (via the CloudWatch agent) memory	Record EC2 start/stop, S3 bucket policy change, IAM role assumption
Action: trigger autoscaling, SNS alerts	Action: compliance audits, security investigation

Both should be enabled: CloudWatch for operational alerting, CloudTrail for security and compliance auditing. CloudTrail logs should be sent to a separate, protected S3 bucket with Object Lock for tamper-proof audit logs.
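
The “who did what, when, from where” question maps onto a handful of fields in each CloudTrail record. The sketch below filters records shaped like CloudTrail events. The field names follow the CloudTrail record format, but every value (accounts, users, IPs, timestamps) is invented, and who_did_what is a hypothetical helper.

```python
# Records shaped like CloudTrail events; all values are made up for illustration.
EVENTS = [
    {
        "eventTime": "2024-05-01T09:15:00Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "PutBucketPolicy",
        "sourceIPAddress": "198.51.100.10",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
    },
    {
        "eventTime": "2024-05-01T09:20:00Z",
        "eventSource": "ec2.amazonaws.com",
        "eventName": "StopInstances",
        "sourceIPAddress": "198.51.100.22",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/bob"},
    },
    {
        "eventTime": "2024-05-01T09:30:00Z",
        "eventSource": "sts.amazonaws.com",
        "eventName": "AssumeRole",
        "sourceIPAddress": "203.0.113.5",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
    },
]


def who_did_what(events, event_name):
    """Summarize matching events as (who, what, when, from where) tuples."""
    return [
        (e["userIdentity"]["arn"], e["eventName"], e["eventTime"], e["sourceIPAddress"])
        for e in events
        if e["eventName"] == event_name
    ]
```

A security investigation into a bucket policy change would start from a query like who_did_what(EVENTS, "PutBucketPolicy"), whereas CloudWatch would be the place to look for the latency or error spike that followed.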

Q14. Describe the Shared Responsibility Model.

AWS is responsible for the cloud:

Physical data centers, hardware, networking infrastructure
Hypervisor and managed service infrastructure
AZ and region fault isolation

Customer is responsible in the cloud:

Guest OS patches (EC2)
Application security, data encryption, network access controls
IAM configuration, MFA enforcement
Data classification and backup

For managed services (RDS, Lambda, S3), AWS takes more responsibility (OS, patching, replication), but the customer remains responsible for access control, encryption settings, and application-level security.

Q15. What is the AWS Well-Architected Framework and what are its pillars?

A set of design principles and best practices for building reliable, secure, efficient, and cost-effective systems on AWS:

Operational Excellence — run and monitor systems, continually improve
Security — protect data, systems, and assets via IAM, encryption, monitoring
Reliability — recover from failures, scale to meet demand, manage change
Performance Efficiency — use resources efficiently, select right instance types
Cost Optimization — avoid unnecessary costs, understand spending over time
Sustainability (added 2021) — minimize environmental impact

The Well-Architected Tool in the AWS console performs reviews against these pillars and surfaces actionable recommendations.