GCP Google Cloud Platform 25 guides · updated 2026

Guides to BigQuery, Vertex AI, GKE, Dataflow, and the rest of Google's data- and AI-first cloud — written for engineers shipping real workloads.

Google Compute Engine: VMs You Control in a Global Infrastructure

Compute Engine is Google Cloud’s Infrastructure-as-a-Service offering. You get virtual machines that run on Google’s hardware, in Google’s data centers, connected by Google’s network — but you control the operating system, software stack, and configuration. Unlike managed services where Google abstracts away the underlying infrastructure, Compute Engine hands you a server.

This matters for workloads that need specific operating system configurations, software licenses tied to specific hardware, or the ability to run code that wasn’t designed for containers or serverless environments.


Machine Families: Picking the Right CPU/Memory Ratio

Compute Engine organizes VM configurations into machine families, each optimized for different workload profiles.

┌──────────────────────────────────────────────────────────────────┐
│ Machine Family Overview                                          │
├──────────────┬───────────────────────────────────────────────────┤
│ E2           │ Cost-optimized general purpose. Up to 32 vCPU.    │
│              │ Good for web servers, dev environments, low-      │
│              │ traffic applications.                             │
├──────────────┼───────────────────────────────────────────────────┤
│ N2 / N2D     │ Balanced general purpose. Intel or AMD.           │
│              │ Higher sustained performance than E2.             │
│              │ Good for databases, business applications.        │
├──────────────┼───────────────────────────────────────────────────┤
│ C2 / C3      │ Compute-optimized. High clock speed, low latency. │
│              │ HPC, game servers, CPU-intensive batch jobs.      │
├──────────────┼───────────────────────────────────────────────────┤
│ M2 / M3      │ Memory-optimized. Up to 12 TB RAM.                │
│              │ In-memory databases, SAP HANA, large analytics.   │
├──────────────┼───────────────────────────────────────────────────┤
│ A2 / A3      │ Accelerator-optimized. NVIDIA GPUs attached.      │
│              │ ML training, inference, HPC with GPU.             │
└──────────────┴───────────────────────────────────────────────────┘

Within each family you can choose predefined sizes (n2-standard-4 gives 4 vCPU and 16 GB RAM) or create a custom machine type specifying exact vCPU and memory counts.
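As a rough illustration of how predefined N2 names map to memory, the sketch below assumes the per-vCPU ratios published for the N2 family (4 GB for standard, 8 GB for highmem, 1 GB for highcpu). The function name `n2_memory_gb` is hypothetical, and `gcloud compute machine-types describe` remains the authoritative source.

```shell
#!/usr/bin/env bash
# Sketch: estimate memory (GB) for a predefined N2 machine type name.
# Assumed ratios (verify against the machine-types documentation):
#   standard = 4 GB/vCPU, highmem = 8 GB/vCPU, highcpu = 1 GB/vCPU
n2_memory_gb() {
  local class vcpus
  class="$(echo "$1" | cut -d- -f2)"   # e.g. "standard" from n2-standard-4
  vcpus="$(echo "$1" | cut -d- -f3)"   # e.g. "4"
  case "$class" in
    standard) echo $(( vcpus * 4 )) ;;
    highmem)  echo $(( vcpus * 8 )) ;;
    highcpu)  echo $(( vcpus * 1 )) ;;
    *) echo "unknown class: $class" >&2; return 1 ;;
  esac
}

n2_memory_gb n2-standard-4   # prints 16
n2_memory_gb n2-highmem-8    # prints 64
```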


Custom Machine Types: Breaking the Predefined Mold

One of Compute Engine’s differentiators vs AWS EC2 is the ability to specify any combination of vCPU count and memory within certain bounds. This avoids the common situation where you need 6 vCPUs but the next available predefined size gives you 8, with 2 going to waste.

# Create an N2 VM with exactly 6 vCPUs and 20 GB RAM
# (without --custom-vm-type, gcloud creates an N1 custom machine type)
gcloud compute instances create my-custom-vm \
    --custom-vm-type=n2 \
    --custom-cpu=6 \
    --custom-memory=20480MB \
    --zone=us-central1-a \
    --image-family=debian-12 \
    --image-project=debian-cloud

Custom VMs are available in the N2, N2D, E2, and N1 families. Memory increments must be in multiples of 256 MB.
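The 256 MB rule is easy to trip over when a memory figure comes from a sizing spreadsheet. A minimal sketch that checks and rounds a request follows. The helper names are hypothetical, and it checks only the 256 MB increment, not the per-family limits on memory per vCPU.

```shell
#!/usr/bin/env bash
# Succeeds only if the memory size (in MB) is a multiple of 256.
is_valid_custom_memory() {
  [ $(( $1 % 256 )) -eq 0 ]
}

# Round a requested memory size (in MB) up to the next multiple of 256.
round_up_256() {
  echo $(( ($1 + 255) / 256 * 256 ))
}

round_up_256 20000   # prints 20224
round_up_256 20480   # prints 20480 (already valid)
```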


Preemptible and Spot VMs: Trading Availability for Cost

Standard VMs are available until you stop them. Preemptible (older model) and Spot (current model) VMs are the opposite: Google can reclaim them with a 30-second warning whenever it needs the capacity back. In exchange, they cost 60-91% less than equivalent standard VMs.

Cost comparison (N2 standard-4, us-central1):
Standard: ~$0.19 / hour
Spot: ~$0.04 / hour
Savings: ~79%
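The arithmetic behind that comparison can be reproduced with a short sketch. The hourly rates are the approximate figures quoted above, and the 730-hour month is an assumed convention for a full month of uptime.

```shell
#!/usr/bin/env bash
# Percentage saved by Spot relative to the standard hourly price.
savings_pct() {
  awk -v std="$1" -v spot="$2" 'BEGIN { printf "%.0f\n", (std - spot) / std * 100 }'
}

# Cost of one VM running for a full month (assumes 730 hours per month).
monthly_cost() {
  awk -v rate="$1" 'BEGIN { printf "%.2f\n", rate * 730 }'
}

savings_pct 0.19 0.04   # prints 79
monthly_cost 0.19       # prints 138.70
monthly_cost 0.04       # prints 29.20
```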

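Spot VMs suit fault-tolerant workloads that can be interrupted and restarted without losing work.

They do not suit workloads that must stay continuously available or that cannot wind down cleanly within the 30-second warning.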

# Create a Spot VM that is stopped (not deleted) when Google reclaims it
gcloud compute instances create spot-worker \
    --machine-type=n2-standard-4 \
    --provisioning-model=SPOT \
    --instance-termination-action=STOP \
    --zone=us-central1-a \
    --image-family=debian-12 \
    --image-project=debian-cloud
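One way a worker can react to the 30-second warning is to poll the metadata server's `preempted` value from inside the VM. The sketch below assumes that approach. `save_checkpoint` is a hypothetical placeholder for your own state-saving logic, and the 5-second polling interval is arbitrary.

```shell
#!/usr/bin/env bash
# Metadata key that reads TRUE once Compute Engine starts reclaiming the VM.
PREEMPTED_URL="http://metadata.google.internal/computeMetadata/v1/instance/preempted"

# Hypothetical placeholder: replace with your own state-saving logic.
save_checkpoint() { echo "checkpoint saved"; }

# Only the literal string TRUE means the VM is being preempted.
is_preempted() { [ "$1" = "TRUE" ]; }

# Poll the metadata server and checkpoint once preemption is reported.
# The URL only resolves from inside a Compute Engine VM.
watch_for_preemption() {
  local state
  while true; do
    state="$(curl -s --max-time 2 -H 'Metadata-Flavor: Google' "$PREEMPTED_URL" || true)"
    if is_preempted "$state"; then
      save_checkpoint
      return 0
    fi
    sleep 5
  done
}
```

A worker would typically start `watch_for_preemption &` alongside its main process, for example from a startup script.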

Persistent Disks: Storage That Survives VM Restarts

Local SSD storage is ephemeral: stop or delete the instance and the data is gone. Persistent Disks (PDs) attach to a VM but persist independently of the VM lifecycle. Stop the VM, detach the disk, attach it to another VM: the data is still there. A boot disk is itself a Persistent Disk, but by default it is deleted along with its VM unless you turn auto-delete off.

VM lifecycle vs disk lifecycle:
  VM:         RUNNING ──► STOPPED ──► DELETED
  PD:         ─────────────────────────────── still exists

VM lifecycle vs local SSD:
  VM:         RUNNING ──► STOPPED ──► DELETED
  Local SSD:  ─────────────────────── wiped

Disk types by performance tier:

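┌───────────────────┬─────────────────────────────────┬──────────────────────────────────────────┐
│ Type              │ IOPS / GB                       │ Best for                                 │
├───────────────────┼─────────────────────────────────┼──────────────────────────────────────────┤
│ pd-standard (HDD) │ Low                             │ Cold storage, batch reads                │
│ pd-balanced       │ Medium                          │ General purpose, OS disks                │
│ pd-ssd            │ High                            │ Databases, transactional workloads       │
│ pd-extreme        │ Very high                       │ SAP HANA, Oracle, high-frequency trading │
│ Hyperdisk Extreme │ Configurable up to 350,000 IOPS │ Latency-sensitive enterprise workloads   │
└───────────────────┴─────────────────────────────────┴──────────────────────────────────────────┘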
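To make the disk types concrete, here is a minimal sketch of creating a data disk and attaching it to an existing VM. The disk name `db-data`, the 200 GB size, and the `run` dry-run wrapper are illustrative assumptions. By default the script only prints the gcloud commands so they can be reviewed before running with `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# Print commands by default; set DRY_RUN=0 to execute them for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Create a zonal Persistent Disk of the given size (GB) and type.
create_data_disk() {
  local name="$1" size_gb="$2" type="$3" zone="$4"
  run gcloud compute disks create "$name" \
    --size="${size_gb}GB" --type="$type" --zone="$zone"
}

# Attach an existing disk to a VM in the same zone.
attach_data_disk() {
  local vm="$1" disk="$2" zone="$3"
  run gcloud compute instances attach-disk "$vm" --disk="$disk" --zone="$zone"
}

create_data_disk db-data 200 pd-ssd us-central1-a
attach_data_disk my-custom-vm db-data us-central1-a
```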

Managed Instance Groups: Scaling and Self-Healing VMs

A Managed Instance Group (MIG) creates and manages a collection of identical VMs from an instance template. If a VM fails its health check, the MIG replaces it automatically. If traffic increases, the MIG scales out by adding VMs.

  Load Balancer
┌───────┴────────┐
│    Managed     │
│ Instance Group │
│  ┌──────────┐  │
│  │  VM - 1  │  │  ← health check fails, MIG replaces
│  ├──────────┤  │
│  │  VM - 2  │  │
│  ├──────────┤  │
│  │  VM - 3  │  │  ← autoscaler adds this when CPU > threshold
│  └──────────┘  │
└────────────────┘

Setting up a MIG requires an instance template, then a group definition:

# Create the template
gcloud compute instance-templates create web-template \
    --machine-type=n2-standard-2 \
    --image-family=debian-12 \
    --image-project=debian-cloud \
    --boot-disk-size=50GB \
    --tags=http-server

# Create the managed instance group
gcloud compute instance-groups managed create web-mig \
    --template=web-template \
    --size=3 \
    --zone=us-central1-a

# Add autoscaling
gcloud compute instance-groups managed set-autoscaling web-mig \
    --zone=us-central1-a \
    --min-num-replicas=2 \
    --max-num-replicas=10 \
    --target-cpu-utilization=0.6
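To build intuition for what `--target-cpu-utilization=0.6` does, the sketch below uses a simplified proportional model: scale the group so that average CPU lands back at the target, then clamp to the configured bounds. This is an assumption for illustration, not Google's exact autoscaler algorithm, which also applies stabilization periods. CPU values are whole percentages so the math stays in integers.

```shell
#!/usr/bin/env bash
# Simplified model (an assumption, not the real autoscaler):
#   desired = ceil(current * observed_cpu / target_cpu), clamped to [min, max]
desired_replicas() {
  local current="$1" observed_pct="$2" target_pct="$3" min="$4" max="$5"
  local desired
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * observed_pct + target_pct - 1) / target_pct ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

desired_replicas 3 90 60 2 10   # prints 5  (3 VMs at 90% CPU, target 60%)
desired_replicas 3 20 60 2 10   # prints 2  (clamped to the minimum)
desired_replicas 8 95 60 2 10   # prints 10 (clamped to the maximum)
```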

Live Migration: Maintenance Without Downtime

When Google needs to perform hardware maintenance on the physical host running your VM, Compute Engine live-migrates the VM to another host transparently. Your VM keeps running. This contrasts with cloud providers that simply reboot VMs during maintenance windows.

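Live migration is on by default for standard VMs. VMs with attached GPUs cannot be live-migrated: they are stopped for host maintenance and receive advance notice before the event. Spot VMs are not live-migrated either. VMs with Local SSDs can generally be live-migrated along with their data, though some storage-heavy machine types are exceptions.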
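From inside a VM, the pending maintenance action can be read from the metadata server's `maintenance-event` value. The sketch below assumes the three values documented for that key. The function names and the wording of the messages are illustrative.

```shell
#!/usr/bin/env bash
MAINT_URL="http://metadata.google.internal/computeMetadata/v1/instance/maintenance-event"

# Translate the metadata value into a human-readable action.
describe_maintenance_event() {
  case "$1" in
    NONE)                          echo "no maintenance pending" ;;
    MIGRATE_ON_HOST_MAINTENANCE)   echo "live migration imminent" ;;
    TERMINATE_ON_HOST_MAINTENANCE) echo "VM will be stopped: save state now" ;;
    *)                             echo "unrecognized value: $1" ;;
  esac
}

# Fetch the current value (only resolves from inside a Compute Engine VM).
current_maintenance_event() {
  curl -s --max-time 2 -H 'Metadata-Flavor: Google' "$MAINT_URL"
}
```

On a VM, `describe_maintenance_event "$(current_maintenance_event)"` combines the two.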


Networking: VPCs, Subnets, and Firewall Rules

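Every VM runs inside a VPC (Virtual Private Cloud) network and sits in a subnet within a specific region. The default network is an auto mode VPC that creates a subnet in every region and ships with permissive firewall rules (for example, SSH and RDP allowed from any source). That is convenient for getting started but less secure for production deployments.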

# Create a custom VPC with a specific subnet
gcloud compute networks create production-vpc \
    --subnet-mode=custom

gcloud compute networks subnets create app-subnet \
    --network=production-vpc \
    --range=10.10.0.0/24 \
    --region=us-central1

# Create VM in that subnet
gcloud compute instances create app-server \
    --machine-type=n2-standard-4 \
    --subnet=app-subnet \
    --zone=us-central1-a \
    --no-address \
    --image-family=debian-12 \
    --image-project=debian-cloud

--no-address creates the VM with no external (public) IP. The VM is reachable only from inside the VPC, and it needs Cloud NAT for outbound connections to the internet.
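A minimal sketch of the Cloud NAT side follows, assuming one Cloud Router and one NAT gateway per region. The names `nat-router` and `nat-config` and the `run` dry-run wrapper are hypothetical. By default the commands are only printed, and `DRY_RUN=0` executes them.

```shell
#!/usr/bin/env bash
# Print commands by default; set DRY_RUN=0 to execute them for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Create a Cloud Router and a NAT gateway covering every subnet range
# of the given network in the given region.
create_cloud_nat() {
  local network="$1" region="$2"
  run gcloud compute routers create nat-router \
    --network="$network" --region="$region"
  run gcloud compute routers nats create nat-config \
    --router=nat-router --region="$region" \
    --auto-allocate-nat-external-ips --nat-all-subnet-ip-ranges
}

create_cloud_nat production-vpc us-central1
```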

Firewall rules control traffic by source/destination tags, IP ranges, and protocols:

# Allow HTTPS only from Google's load balancer and health-check ranges
gcloud compute firewall-rules create allow-https \
    --network=production-vpc \
    --allow=tcp:443 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --target-tags=https-server
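When debugging a rule like this, it helps to check whether a client address actually falls inside one of the source ranges. The sketch below does that check with plain shell arithmetic. The helper names are hypothetical, and it handles IPv4 only.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local a b c d rest
  a="${1%%.*}"; rest="${1#*.}"
  b="${rest%%.*}"; rest="${rest#*.}"
  c="${rest%%.*}"; d="${rest#*.}"
  echo $(( a * 16777216 + b * 65536 + c * 256 + d ))
}

# Succeeds if the IP (first argument) lies inside the CIDR block (second).
ip_in_cidr() {
  local base="${2%/*}" bits="${2#*/}"
  local size start addr
  size=$(( 1 << (32 - bits) ))                        # addresses in the block
  start=$(( $(ip_to_int "$base") / size * size ))     # first address of the block
  addr=$(ip_to_int "$1")
  [ "$addr" -ge "$start" ] && [ "$addr" -lt $(( start + size )) ]
}

ip_in_cidr 130.211.1.5 130.211.0.0/22 && echo "inside"    # prints inside
ip_in_cidr 130.211.4.1 130.211.0.0/22 || echo "outside"   # prints outside
```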

Real-World Use Case: Three-Tier Web Application

A typical deployment pattern using Compute Engine:

Internet
   │
   ▼
Cloud Load Balancer (global)
   │
   ▼
MIG: web-tier (n2-standard-2, 3-10 VMs)
     Handles HTTP: renders pages, calls backend API
   │  (internal VPC traffic only)
   ▼
MIG: app-tier (n2-standard-4, 2-6 VMs)
     Business logic, session management
   │  (internal VPC traffic only)
   ▼
Cloud SQL (managed PostgreSQL)
  or VM: database-tier (n2-highmem-8, single VM with pd-extreme)

The key pattern: each tier has no external IP, firewall rules restrict traffic to only the required port and source, and MIGs handle availability and scaling automatically.
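The tier-to-tier restriction could be expressed with tag-based firewall rules, sketched below. The tags `web-tier` and `app-tier`, the rule name, port 8080, and the `run` dry-run wrapper are all assumptions for illustration, and the instance templates would need to carry the matching tags. Commands are printed by default, and `DRY_RUN=0` executes them.

```shell
#!/usr/bin/env bash
# Print commands by default; set DRY_RUN=0 to execute them for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Allow one TCP port from VMs tagged src_tag to VMs tagged dst_tag.
allow_tier_traffic() {
  local rule="$1" src_tag="$2" dst_tag="$3" port="$4"
  run gcloud compute firewall-rules create "$rule" \
    --network=production-vpc \
    --allow="tcp:${port}" \
    --source-tags="$src_tag" \
    --target-tags="$dst_tag"
}

allow_tier_traffic allow-web-to-app web-tier app-tier 8080
```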


Summary

Compute Engine gives you genuine infrastructure control: VMs running on Google’s global network, with the flexibility to choose machine types down to exact vCPU and memory counts, pick disk types by IOPS requirements, and control networking at the subnet and firewall-rule level. Managed Instance Groups take over the operational work of health checking and scaling. Spot VMs reduce costs dramatically for fault-tolerant workloads. The key skill is matching the machine family, disk type, and networking model to what the workload actually needs rather than defaulting to the largest available size.