Google BigQuery: Serverless SQL Analytics Across Terabytes in Seconds
A financial firm needed to run fraud detection queries against 500 million daily transactions. Their on-premises data warehouse took 45 minutes per query. After migrating to BigQuery, the same queries ran in under 30 seconds — with zero infrastructure to manage. That is what serverless analytics looks like in practice.
Storage and Compute Are Separate
BigQuery’s most important design decision is the decoupling of storage from compute. Traditional databases tie the two together — if you want faster queries, you also pay for more disk. BigQuery breaks that link entirely.
+-------------------------------------------------------------+| BigQuery Architecture |+--------------------+----------------------------------------+| Query Layer | Dremel engine distributes SQL across || (Slots) | thousands of workers in parallel |+--------------------+----------------------------------------+| Shuffle Layer | Jupiter network moves intermediate || | data between worker stages |+--------------------+----------------------------------------+| Storage Layer | Colossus distributed file system || (Capacitor) | stores data in columnar format |+-------------------------------------------------------------+Storage is charged separately at 6.25 per TB scanned) or reservation-based (flat-rate slots). You pay for what you query, not what you store nearby.
Slots: The Unit of Compute
A slot is a virtual CPU unit in Dremel. BigQuery allocates slots to your queries automatically in on-demand mode. With reservations, you purchase a fixed number of slots and assign them to projects.
- On-demand accounts get up to 2,000 slots per project by default
- Each TB of data scanned uses roughly 1–2 slots for a second
- Heavy analytical queries can fan out across hundreds of slots simultaneously
For predictable workloads running all day, reservations typically cost less than on-demand. For occasional heavy queries, on-demand is almost always cheaper.
Partitioning: Pruning Scanned Data
Without partitioning, BigQuery scans every row in a table for every query. With partitioning, it skips entire date shards or integer ranges when the WHERE clause filters on the partition column.
Table: sales_events (partitioned by event_date)
Partition 2025-01-01 | Partition 2025-01-02 | Partition 2025-01-03----------------------|------------------------|----------------------row 1... | row 1... | row 1...row 2... | row 2... | row 2...
Query: WHERE event_date = '2025-01-02' --> Only the middle partition is read --> Other partitions are skipped entirelyYou can partition on:
- A DATE or TIMESTAMP column (ingestion time is the default)
- An INTEGER RANGE column (useful for user IDs, shard keys)
Combine partitioning with a WHERE clause on that column in every production query. Without it, partition pruning does not happen.
Clustering: Sorting Within Partitions
Clustering physically sorts data within each partition by one to four columns. When a query filters on clustered columns, BigQuery reads only the relevant sorted blocks rather than the full partition.
Partition 2025-01-02 (clustered by region, product_category)
Block 1: region=APAC, product_category=ElectronicsBlock 2: region=APAC, product_category=ApparelBlock 3: region=EMEA, product_category=ElectronicsBlock 4: region=US, product_category=Apparel
Query: WHERE event_date='2025-01-02' AND region='EMEA' --> Only Block 3 is readClustering is free — it costs nothing extra to create. The benefit compounds with high-cardinality columns you filter frequently.
Ingesting Data: Streaming vs Batch
Batch load jobs are free. You load data from GCS files (CSV, Avro, Parquet, JSON, ORC) using a load job. Data is available for queries after the job completes. Ideal for daily ETL pipelines.
Streaming inserts push individual rows in real time at $0.01 per 200 MB. Rows appear queryable in seconds. Useful for dashboards that need up-to-the-minute data. The trade-off is cost and at-least-once delivery semantics — deduplication requires a unique insertId.
Storage Write API (the modern alternative to streaming) supports exactly-once delivery with a committed offset model, and costs about half of the classic streaming API.
BigQuery ML
BigQuery ML lets you train machine learning models with SQL — no Python, no Spark, no export needed.
CREATE OR REPLACE MODEL sales.churn_modelOPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned'])ASSELECT account_age_days, monthly_spend, support_tickets_last_90d, churnedFROM sales.customer_featuresWHERE event_date < '2025-01-01';After training, run ML.PREDICT inside BigQuery to score your existing tables. Supported model types include linear regression, logistic regression, k-means, matrix factorisation, ARIMA time series, and boosted trees.
Cost Control in Practice
BigQuery costs surprise teams that do not set guardrails early. Three practical controls:
- Column selection — never use
SELECT *on wide tables. Each extra column scanned adds to the bill. - Partitioned table expiry — set
partition_expiration_msso old partitions delete automatically. - Custom quotas — in the BigQuery console, set per-day and per-query byte limits per project or per user.
Query cost estimate pattern:- Before run: check "This query will process X GB" in the console- After run: check job details for actual bytes billed- Slot usage: check INFORMATION_SCHEMA.JOBS_BY_PROJECTMaterialized views pre-compute expensive aggregations and auto-refresh when base data changes. They are one of the fastest ways to reduce recurring query costs.
Real-World Scenario: E-Commerce Analytics Pipeline
An e-commerce platform ingests clickstream data from 10 million daily sessions. Their BigQuery setup:
- Raw events land in GCS every 5 minutes as Parquet files
- Load jobs push files to a raw table partitioned by
event_date, clustered byuser_countryandevent_type - A scheduled query runs nightly to aggregate to a summary table
- Looker Studio connects to the summary table for executive dashboards
Total monthly BigQuery cost for 2 TB of active data and 50 TB queried: roughly 4,200 for an equivalent reserved data warehouse cluster.
Summary
| Feature | What It Does |
|---|---|
| Slots | Units of compute; auto-allocated or reserved |
| Partitioning | Skips full date/range shards on filtered queries |
| Clustering | Sorts data blocks within partitions for faster reads |
| Streaming ingest | Sub-second data availability; costs more |
| Storage Write API | Exactly-once, cheaper alternative to streaming |
| BigQuery ML | Train and predict with SQL, no data export |
BigQuery’s serverless model means you do not tune JVM heap sizes, manage cluster nodes, or pick instance types. You design good table schemas, use partitioning and clustering correctly, and write queries that reference only the columns they need. Everything else is Google’s problem.