🎯 Interview Guides 12 guides · updated 2026

Real questions and structured answers for data, cloud, and AI engineering interviews — including the system-design and GenAI rounds now showing up everywhere.

AWS Interview Questions and Answers

From associate-level cloud concepts to architect-level design decisions — these questions reflect what hiring teams test across cloud engineer, DevOps, data engineer, and solutions architect roles.


Core Concepts

Q1. What is the difference between horizontal and vertical scaling in AWS?

Vertical scaling (scaling up): upgrade to a larger instance type (e.g., t3.medium → m5.xlarge). It is simple, but it is capped by the largest available instance size, and resizing an EC2 instance requires stopping and restarting it.

Horizontal scaling (scaling out): add more instances behind a load balancer. With EC2 this is handled by Auto Scaling Groups (ASG); services such as Lambda and DynamoDB scale out automatically, and ECS on Fargate scales through service auto scaling.

Horizontal scaling is preferred for cloud-native architectures because it:


Q2. Explain the differences between S3 storage classes.

| Storage Class | Use Case | Retrieval | Min Storage Duration |
| --- | --- | --- | --- |
| S3 Standard | Frequently accessed data | Instant | None |
| S3 Intelligent-Tiering | Unknown/changing access patterns | Instant | None |
| S3 Standard-IA | Infrequent access, must be fast | Instant | 30 days |
| S3 One Zone-IA | Infrequent, non-critical | Instant | 30 days |
| S3 Glacier Instant Retrieval | Archive with ms retrieval | Milliseconds | 90 days |
| S3 Glacier Flexible Retrieval | Archive, minutes–hours retrieval | Minutes–hours | 90 days |
| S3 Glacier Deep Archive | Lowest cost, long-term archive | Within 12 hours (standard retrieval) | 180 days |

S3 Lifecycle Policies automate transitions between classes based on object age.
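As a rough illustration of age-based transitions, the sketch below defines a lifecycle rule as a plain Python dictionary and a small helper that reports which class an object of a given age would sit in. The rule ID, the `logs/` prefix, the day thresholds, and the helper `storage_class_at` are all hypothetical, chosen only for the example.

```python
# Hypothetical lifecycle rule: ID, prefix, and thresholds are illustrative.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},  # Glacier Flexible Retrieval
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
            "Expiration": {"Days": 2555},  # delete after roughly seven years
        }
    ]
}


def storage_class_at(age_days, rule):
    """Return the storage class for an object of this age, or None if expired.

    This is a local model of the rule, not a call to S3.
    """
    expiration = rule.get("Expiration", {}).get("Days")
    if expiration is not None and age_days >= expiration:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle_configuration["Rules"][0]
for age in (10, 45, 100, 400):
    print(age, storage_class_at(age, rule))
```

With boto3, a dictionary of this shape is what `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument; the call itself is omitted here so the sketch runs without AWS credentials.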


Q3. What is the difference between Security Groups and Network ACLs?

| Feature | Security Group | Network ACL |
| --- | --- | --- |
| Level | Instance level | Subnet level |
| State | Stateful (return traffic auto-allowed) | Stateless (must allow inbound AND outbound explicitly) |
| Rules | Allow rules only | Allow and Deny rules |
| Evaluation | All rules evaluated | Rules evaluated in ascending rule-number order; first match wins |
| Default | Deny all inbound, allow all outbound | Default NACL allows all in/out; a custom NACL denies all until rules are added |

Typical pattern: Security Groups for fine-grained per-instance control; NACLs as an extra layer to block known bad IP ranges at the subnet boundary.
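The difference in rule evaluation can be sketched with a small local model. This is a simplification for interview discussion, not AWS's implementation: the rule dictionaries, the function names `nacl_allows` and `security_group_allows`, and the sample CIDR ranges are all assumptions made for the example, and only inbound traffic is modeled.

```python
from ipaddress import ip_address, ip_network


def nacl_allows(rules, src_ip, port):
    """NACL model: check rules in ascending number order; first match decides."""
    for rule in sorted(rules, key=lambda r: r["number"]):
        if ip_address(src_ip) in ip_network(rule["cidr"]) and port in rule["ports"]:
            return rule["action"] == "allow"
    return False  # nothing matched: the implicit final deny applies


def security_group_allows(rules, src_ip, port):
    """Security group model: allow-only rules; any matching rule permits."""
    return any(
        ip_address(src_ip) in ip_network(r["cidr"]) and port in r["ports"]
        for r in rules
    )


# Hypothetical rules: block one known-bad range at the subnet, allow HTTPS.
nacl_rules = [
    {"number": 100, "cidr": "203.0.113.0/24", "ports": range(0, 65536), "action": "deny"},
    {"number": 200, "cidr": "0.0.0.0/0", "ports": range(443, 444), "action": "allow"},
]
sg_rules = [{"cidr": "0.0.0.0/0", "ports": range(443, 444)}]

print(nacl_allows(nacl_rules, "203.0.113.7", 443))         # blocked by rule 100
print(security_group_allows(sg_rules, "203.0.113.7", 443))  # SG alone cannot deny
```

The example shows why a deny for a bad IP range has to live in the NACL: a security group has no way to express it.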


Q4. Describe the components of a VPC.

A VPC is a logically isolated virtual network within an AWS region:


Q5. What is IAM and what are its key components?

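IAM (Identity and Access Management) controls who can do what in your AWS account. Its core building blocks are users, groups, roles, and policies (JSON documents that allow or deny specific actions on specific resources).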

Best practices: follow least-privilege, use roles for all service-to-service authentication, require MFA for human users, never use root account for daily work.
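To make least-privilege concrete, here is a minimal sketch that builds an IAM policy document granting read-only access to a single prefix in a single bucket. The function name `read_only_prefix_policy`, the bucket `example-reports-bucket`, and the prefix `2026/` are hypothetical; only the policy grammar (Version, Statement, Effect, Action, Resource, Condition) follows the standard IAM JSON format.

```python
import json


def read_only_prefix_policy(bucket, prefix):
    """Build a policy that allows listing and reading one prefix, nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListPrefixOnly",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                "Sid": "ReadObjectsUnderPrefix",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


# Hypothetical bucket and prefix, used only for illustration.
policy = read_only_prefix_policy("example-reports-bucket", "2026/")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

A policy like this would typically be attached to a role that a service assumes, rather than to long-lived user credentials.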


Compute

Q6. When would you choose Lambda vs EC2 vs ECS Fargate?

| Service | Choose when |
| --- | --- |
| Lambda | Event-driven, short-lived (≤15 min), variable/spiky traffic, no infrastructure management |
| EC2 | Long-running processes, specific OS/kernel requirements, need for persistent local storage, GPU workloads |
| ECS Fargate | Containerized workloads, team knows Docker, want managed infrastructure without EC2 management |
| EKS | Kubernetes required, complex microservices, multi-cloud portability needed |

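Lambda cold starts (roughly 100 ms–3 s, depending on runtime and package size) matter for latency-sensitive APIs. Provisioned Concurrency keeps a set number of execution environments initialized, so requests served by them skip the cold start.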


Q7. What is an Auto Scaling Group and how does it scale?

An ASG maintains a fleet of EC2 instances between minimum and maximum limits, automatically adding or removing instances based on:

Lifecycle hooks allow running custom scripts during instance launch or termination (e.g., drain connections before termination).
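One way to reason about metric-driven scaling in an interview is the proportional rule below: scale the fleet by the ratio of the observed metric to its target, then clamp to the group's limits. This is a simplified model under stated assumptions, not AWS's actual algorithm, which also involves cooldowns, warm-up, and alarm evaluation. The function name and the sample numbers are hypothetical.

```python
import math


def desired_capacity(current, metric_value, target_value, min_size, max_size):
    """Proportional estimate of desired instance count, clamped to ASG limits."""
    proposed = math.ceil(current * metric_value / target_value)
    return max(min_size, min(max_size, proposed))


# Four instances at 80% average CPU with a 50% target: scale out.
print(desired_capacity(current=4, metric_value=80, target_value=50, min_size=2, max_size=10))

# Four instances at 25% average CPU: scale in, but never below the minimum.
print(desired_capacity(current=4, metric_value=25, target_value=50, min_size=2, max_size=10))
```

Rounding up biases the estimate toward availability: the group may run slightly more capacity than strictly needed rather than slightly less.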


Storage & Databases

Q8. What is the difference between RDS and DynamoDB?

| Aspect | RDS | DynamoDB |
| --- | --- | --- |
| Type | Relational (SQL) | NoSQL (key-value + document) |
| Schema | Fixed, predefined | Flexible per item |
| Scaling | Vertical + read replicas | Horizontal, automatic |
| Consistency | Strong by default | Eventual (configurable to strong) |
| Query flexibility | Full SQL | Limited to primary key + GSIs |
| Best for | Complex queries, transactions, relational data | High-throughput, simple access patterns, gaming, sessions |

RDS Multi-AZ provides synchronous replication for HA. RDS Read Replicas (async) offload read traffic.


Q9. Explain S3 versioning and how it protects against accidental deletion.

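When versioning is enabled on an S3 bucket, every overwrite creates a new version instead of replacing the object, and a DELETE request without a version ID adds a delete marker on top of the existing versions rather than removing any data.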

Restoring a deleted file: delete the delete marker to make the previous version current again.

Lifecycle rules can automatically expire old versions after N days to control storage costs.

For extra protection: enable S3 Object Lock (WORM — Write Once Read Many) for immutable compliance storage.
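The delete-marker behavior is easier to see in a toy model. The class below is an in-memory sketch, not an S3 client: `VersionedBucket`, its method names, and the `v1`, `v2` style version IDs are assumptions made for illustration.

```python
import itertools

DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy model of a versioning-enabled bucket (newest version last)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body or DELETE_MARKER)
        self._ids = itertools.count(1)

    def put(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key, version_id=None):
        if version_id is None:
            # Plain delete: stack a delete marker, keep all older versions.
            return self.put(key, DELETE_MARKER)
        # Delete by version ID: permanently remove that one version or marker.
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]
        return version_id

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is DELETE_MARKER:
            return None  # S3 would answer with a 404 here
        return versions[-1][1]


bucket = VersionedBucket()
bucket.put("report.csv", "first draft")
bucket.put("report.csv", "final")
marker_id = bucket.delete("report.csv")          # object now appears deleted
bucket.delete("report.csv", version_id=marker_id)  # remove the marker
print(bucket.get("report.csv"))                  # previous version is current again
```

Note that deleting by version ID is permanent in this model, which mirrors why that permission is usually restricted more tightly than a plain delete.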


High Availability & Architecture

Q10. What is the difference between an Application Load Balancer, Network Load Balancer, and Gateway Load Balancer?

| ALB | NLB | GWLB |
| --- | --- | --- |
| Layer 7 (HTTP/HTTPS/WebSocket) | Layer 4 (TCP/UDP/TLS) | Layer 3 (IP) |
| Content-based routing (path, host, header) | Ultra-low latency, millions of req/sec | Route traffic through third-party appliances |
| Best for REST APIs, microservices | Best for gaming, IoT, financial services | Best for firewalls, intrusion detection |

ALB target groups can be EC2, Lambda, containers, or IPs. NLB can preserve the client IP without X-Forwarded-For headers.
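Content-based routing on an ALB can be pictured as a priority-ordered rule list with a default action. The sketch below is a local approximation only: the `route` function, the rule dictionaries, and the target group names are hypothetical, and Python's `fnmatchcase` is used as a stand-in for ALB wildcard matching, which it resembles but does not exactly replicate.

```python
from fnmatch import fnmatchcase


def route(rules, default_target, host, path):
    """Return the target group of the lowest-priority-number rule that matches."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        host_ok = fnmatchcase(host.lower(), rule.get("host", "*"))
        path_ok = fnmatchcase(path, rule.get("path", "*"))
        if host_ok and path_ok:
            return rule["target_group"]
    return default_target  # the listener's default action


# Hypothetical listener rules.
listener_rules = [
    {"priority": 10, "path": "/api/*", "target_group": "api-service"},
    {"priority": 20, "host": "admin.example.com", "target_group": "admin-service"},
]

print(route(listener_rules, "web-frontend", "www.example.com", "/api/orders"))
print(route(listener_rules, "web-frontend", "admin.example.com", "/"))
print(route(listener_rules, "web-frontend", "www.example.com", "/index.html"))
```

An NLB has no equivalent of this step, since it forwards connections at Layer 4 without inspecting HTTP paths or hosts.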


Q11. What is CloudFront and how does it work?

CloudFront is AWS’s CDN — it caches content at 400+ edge locations worldwide to reduce latency for end users.

Flow: User request → Nearest Edge Location → If cached: serve directly; If not: fetch from origin (S3, ALB, EC2) → cache → serve

Key features:

Cache behavior control: Cache-Control headers from origin, or TTL settings in CloudFront distribution.
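The hit/miss flow and the role of TTL can be sketched with a tiny in-memory cache. This is a conceptual model of a single edge location, not CloudFront's implementation; `EdgeCache`, `fetch_from_origin`, and the 60-second TTL are assumptions for the example, and the clock is passed in explicitly to keep the behavior deterministic.

```python
class EdgeCache:
    """Toy model of one edge location with a fixed TTL per cached object."""

    def __init__(self, fetch_from_origin, ttl_seconds):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._store = {}  # path -> (body, expires_at)

    def get(self, path, now):
        entry = self._store.get(path)
        if entry is not None and now < entry[1]:
            return entry[0], "Hit"
        body = self._fetch(path)  # cache miss or expired: go to the origin
        self._store[path] = (body, now + self._ttl)
        return body, "Miss"


origin_calls = []


def fetch_from_origin(path):
    """Hypothetical origin: records each request and returns a fake body."""
    origin_calls.append(path)
    return f"content of {path}"


edge = EdgeCache(fetch_from_origin, ttl_seconds=60)
print(edge.get("/logo.png", now=0))   # first request: fetched from origin
print(edge.get("/logo.png", now=30))  # within TTL: served from the edge
print(edge.get("/logo.png", now=61))  # TTL expired: fetched again
print(len(origin_calls))
```

A longer TTL lowers origin load and latency but delays how quickly updated content reaches users, which is the trade-off behind invalidations and versioned file names.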


Monitoring & Costs

Q12. What are the key AWS cost optimization strategies?

Right-sizing: analyze CloudWatch metrics to find over-provisioned instances. Use AWS Compute Optimizer for recommendations.

Savings Plans & Reserved Instances: commit to consistent usage for a 1-year or 3-year term in exchange for a discount of up to 72% compared with On-Demand pricing.

Spot Instances: up to 90% discount for fault-tolerant, interruptible workloads (batch processing, CI/CD, rendering).
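A quick worked comparison shows what those headline discounts mean for a monthly bill. The hourly rate of $0.10, the fleet of 10 instances, and the helper `monthly_cost` are hypothetical, and the 72% and 90% figures are the maximum discounts quoted above, so real savings will usually be lower.

```python
HOURS_PER_MONTH = 730  # common approximation for one month of continuous use


def monthly_cost(hourly_on_demand, instances, discount=0.0):
    """Monthly cost in dollars for a fleet running all month at a given discount."""
    return round(hourly_on_demand * (1 - discount) * instances * HOURS_PER_MONTH, 2)


rate = 0.10   # hypothetical On-Demand price per instance-hour
fleet = 10

on_demand = monthly_cost(rate, fleet)
committed = monthly_cost(rate, fleet, discount=0.72)  # best-case Savings Plan / RI
spot = monthly_cost(rate, fleet, discount=0.90)       # best-case Spot

print(on_demand, committed, spot)
```

In practice a fleet is often split: a committed baseline for steady load, Spot for interruptible work, and On-Demand for the remainder.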

S3 cost optimization: lifecycle policies to move data to cheaper tiers; S3 Intelligent-Tiering for unknown access patterns.

Data transfer: keep traffic within the same AZ where possible; use gateway VPC endpoints for S3 and DynamoDB so that traffic bypasses the NAT Gateway and its data-processing charges; use CloudFront to cache at the edge.

Cost monitoring: AWS Cost Explorer for trends; AWS Budgets for alerts; Cost Allocation Tags to attribute costs to teams/projects.


Q13. How does CloudWatch differ from CloudTrail?

| CloudWatch | CloudTrail |
| --- | --- |
| Performance and operational monitoring | Audit and governance log of API calls |
| Metrics, logs, alarms, dashboards | "Who did what, when, from where" |
| Monitor CPU, memory, latency, errors | Record EC2 start/stop, S3 bucket policy change, IAM role assumption |
| Action: trigger autoscaling, SNS alerts | Action: compliance audits, security investigation |

Both should be enabled: CloudWatch for operational alerting, CloudTrail for security and compliance auditing. CloudTrail logs should be sent to a separate, protected S3 bucket with Object Lock for tamper-proof audit logs.
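The "who did what, when, from where" question maps directly onto fields in a CloudTrail record. The sketch below filters a list of already-parsed records; the sample events, ARNs, and IP addresses are fabricated placeholders for illustration, and the helper `who_did` is hypothetical. The field names (`eventTime`, `eventName`, `userIdentity`, `sourceIPAddress`) follow the CloudTrail record format.

```python
def who_did(records, event_name):
    """Return (when, who, from where) for every record with the given event name."""
    return [
        (r["eventTime"], r["userIdentity"]["arn"], r["sourceIPAddress"])
        for r in records
        if r["eventName"] == event_name
    ]


# Placeholder records shaped like CloudTrail events (values are made up).
sample_records = [
    {
        "eventTime": "2026-01-15T09:12:00Z",
        "eventName": "StopInstances",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
        "sourceIPAddress": "198.51.100.10",
    },
    {
        "eventTime": "2026-01-15T09:30:00Z",
        "eventName": "PutBucketPolicy",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:role/deploy"},
        "sourceIPAddress": "203.0.113.25",
    },
]

for when, who, where in who_did(sample_records, "PutBucketPolicy"):
    print(when, who, where)
```

CloudWatch would answer a different question about the same incident, such as whether error rates or latency changed after the policy was modified.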


Q14. Describe the Shared Responsibility Model.

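AWS is responsible for security of the cloud: the physical data centers, hardware, networking, and virtualization layer that run AWS services.

The customer is responsible for security in the cloud: their data, IAM configuration, network and firewall settings, encryption choices, and, on EC2, guest OS patching.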

For managed services (RDS, Lambda, S3), AWS takes more responsibility (OS, patching, replication), but the customer remains responsible for access control, encryption settings, and application-level security.


Q15. What is AWS Well-Architected Framework and its pillars?

A set of design principles and best practices for building reliable, secure, efficient, and cost-effective systems on AWS:

  1. Operational Excellence — run and monitor systems, continually improve
  2. Security — protect data, systems, and assets via IAM, encryption, monitoring
  3. Reliability — recover from failures, scale to meet demand, manage change
  4. Performance Efficiency — use resources efficiently, select right instance types
  5. Cost Optimization — avoid unnecessary costs, understand spending over time
  6. Sustainability (added 2021) — minimize environmental impact

The Well-Architected Tool in the AWS console performs reviews against these pillars and surfaces actionable recommendations.