Google Cloud Spanner: Globally Distributed Relational Database With External Consistency
Every other horizontally scalable database asks you to give something up — either strong consistency, SQL semantics, or geographic distribution. Cloud Spanner gives up none of them. It runs ACID transactions across nodes in multiple regions and globally, guarantees external consistency (stronger than serializable), and speaks standard SQL.
The reason this is hard to build is that distributed transactions require coordinating clocks across machines, and atomic clocks in commodity hardware are not reliable enough. Google solved this with TrueTime. Understanding TrueTime is the key to understanding how Spanner works.
TrueTime: The Clock That Makes Spanner Possible
Normal computer clocks drift. Two servers in different data centers might disagree on the current time by tens of milliseconds. In a distributed database, using those clocks to order transactions leads to anomalies — transaction B that “happened after” transaction A might actually appear to have a lower timestamp.
Google’s TrueTime API exposes the current time not as a single value but as an interval: [earliest, latest]. The width of that interval reflects the maximum possible clock uncertainty, which Google keeps under 7 milliseconds by using GPS receivers and atomic clocks in each data center.
Spanner uses TrueTime to assign commit timestamps. When a transaction commits, Spanner waits until it is certain that no other transaction committed at the same time — essentially waiting out the uncertainty window before returning success. This wait is short (typically < 7 ms) and guarantees that if you observe transaction B’s result, transaction B definitely committed after any transaction you observed previously.
Transaction A commits at TrueTime [T-3ms, T+3ms]Transaction B commits at TrueTime [T+5ms, T+11ms]
Spanner is certain: A committed before B.No ambiguity, no anomalies.External consistency guaranteed.Architecture: Splits, Paxos, and Read/Write Paths
Spanner automatically shards data into chunks called splits. Each split is replicated across Paxos groups — typically three or five replicas, each in a different zone or region. Writes require a majority vote (quorum) among replicas.
Client │ ▼Cloud Spanner front-end (routes to correct shard) │ ▼Paxos group for that split: ├── Replica A (us-east1-b) ← leader: coordinates write ├── Replica B (us-east1-c) ← follower └── Replica C (us-east1-d) ← follower
Write operation: Leader proposes ──► 2 of 3 replicas vote YES ──► commit TrueTime timestamp assigned ──► response to clientReads can go to any replica (even followers for stale reads) or only to the leader (for strongly consistent reads at a specific timestamp).
Instance Configurations: Regional vs Multi-Region
When you create a Spanner instance, you choose a configuration that determines where replicas are placed.
Regional configurations (e.g., us-central1): Three read-write replicas, all in the same region, in different zones. Low write latency (3-7 ms for a round-trip within a region). Suitable for applications where all users are geographically close to that region.
Multi-region configurations (e.g., nam6 — North America 6-replica): Read-write replicas across multiple regions. Writes must achieve quorum across regions, adding latency (50-100 ms for cross-continental). Suitable for applications where data must survive a regional outage, or where users span multiple continents.
nam6 configuration: Northern Virginia (primary region, 2 read-write replicas) Iowa (primary region, 2 read-write replicas) South Carolina (witness, 2 read-only replicas)
A write must reach quorum across primary regions. If Northern Virginia goes down: Iowa still has 2 R/W replicas → quorum achievable → database stays upCreating an Instance and Schema
# Create a regional Spanner instancegcloud spanner instances create prod-spanner \ --config=us-central1 \ --description="Production Spanner Instance" \ --processing-units=100
# Create a databasegcloud spanner databases create inventory-db \ --instance=prod-spanner
# Create tables with DDLgcloud spanner databases ddl update inventory-db \ --instance=prod-spanner \ --ddl=' CREATE TABLE Products ( ProductId STRING(36) NOT NULL, Name STRING(256) NOT NULL, Category STRING(64), StockCount INT64, UpdatedAt TIMESTAMP NOT NULL OPTIONS (allow_commit_timestamp=true), ) PRIMARY KEY (ProductId);
CREATE INDEX ProductsByCategory ON Products (Category); 'allow_commit_timestamp=true lets Spanner populate that column automatically with the commit timestamp of each write — useful for auditing and change tracking.
Interleaved Tables: Colocation for Performance
When a parent-child relationship exists between tables, Spanner can interleave child rows physically adjacent to their parent rows. A query that joins a parent and its children then becomes a local read rather than a distributed lookup.
WITHOUT interleaving: Products table ──── stored anywhere OrderItems table ─── stored anywhere JOIN = distributed lookup for each row
WITH interleaving: ProductId: "PROD-001" ├── Products row: { Name: "Widget", Stock: 100 } ├── OrderItem row: { OrderId: "O-1", Qty: 2 } └── OrderItem row: { OrderId: "O-2", Qty: 5 } (all stored on the same split, same machine)CREATE TABLE OrderItems ( ProductId STRING(36) NOT NULL, OrderId STRING(36) NOT NULL, Quantity INT64 NOT NULL,) PRIMARY KEY (ProductId, OrderId), INTERLEAVE IN PARENT Products ON DELETE CASCADE;Reading and Writing with the Python Client
from google.cloud import spannerfrom google.cloud.spanner_v1 import param_types
client = spanner.Client()instance = client.instance("prod-spanner")database = instance.database("inventory-db")
# Strong read (external consistency guaranteed)with database.snapshot() as snapshot: results = snapshot.execute_sql( "SELECT ProductId, Name, StockCount FROM Products WHERE Category = @cat", params={"cat": "Electronics"}, param_types={"cat": param_types.STRING}, ) for row in results: print(f"{row[0]}: {row[1]} (stock: {row[2]})")
# Read-write transaction with retry handlingdef deduct_inventory(transaction, product_id, quantity): row = transaction.execute_sql( "SELECT StockCount FROM Products WHERE ProductId = @pid", params={"pid": product_id}, param_types={"pid": param_types.STRING}, ).one()
if row[0] < quantity: raise ValueError("Insufficient stock")
transaction.execute_update( "UPDATE Products SET StockCount = StockCount - @qty " "WHERE ProductId = @pid", params={"pid": product_id, "qty": quantity}, param_types={"pid": param_types.STRING, "qty": param_types.INT64}, )
database.run_in_transaction(deduct_inventory, "PROD-001", 5)run_in_transaction automatically retries the transaction if Spanner returns an ABORTED error due to lock contention. This retry is safe because the transaction function is idempotent within its execution context.
Processing Units: Scaling Compute
Spanner is billed in processing units (PU). 1,000 PU = 1 node. You can start at 100 PU for small databases and scale in 100 PU increments.
100 PU ── dev/test, small production apps1,000 PU ── 2,000 QPS reads, ~1,800 QPS writes (varies with query complexity)3,000 PU ── high-volume OLTP
Each 1,000 PU provides approximately: 2 TB storage 2,000 reads/second (simple point reads) 1,800 writes/secondStorage is billed separately from processing units. There is no separate storage limit — you pay for what you store.
Spanner vs Cloud SQL: When to Choose Which
┌────────────────────────────────┬───────────────────┬──────────────────────┐│ Requirement │ Cloud SQL │ Cloud Spanner │├────────────────────────────────┼───────────────────┼──────────────────────┤│ Standard relational model │ Yes │ Yes (SQL 2011) ││ Global distribution │ No (regional only) │ Yes ││ Horizontal write scaling │ No (single primary)│ Yes (automatic) ││ > 10 TB data │ Up to 64 TB │ Unlimited ││ Sub-millisecond read latency │ Yes (local) │ ~1ms (regional) ││ Cost at small scale │ Much lower │ Higher ││ Existing MySQL/PostgreSQL code │ Easy migration │ Requires DDL changes │└────────────────────────────────┴───────────────────┴──────────────────────┘Spanner makes sense when you need global consistency and multi-region writes, or when your dataset and write rate exceed what a single Cloud SQL primary can handle. For most applications — even large ones — Cloud SQL covers the requirements at a fraction of the cost.
Summary
Cloud Spanner’s core value is the combination it uniquely offers: horizontal write scaling, SQL semantics, and external consistency. TrueTime makes that consistency possible without sacrificing availability during network partitions. Instance configurations determine whether you pay for regional low-latency or multi-region durability. Interleaved tables co-locate related data to minimize distributed reads. The Python client’s run_in_transaction handles Spanner’s optimistic concurrency retries transparently. The trade-off is cost — Spanner is significantly more expensive than Cloud SQL, so it belongs in use cases where the scale or global distribution requirements genuinely demand it.