Cloud  /  Google Cloud

GCP Google Cloud Platform 25 guides · updated 2026

Guides to BigQuery, Vertex AI, GKE, Dataflow, and the rest of Google's data- and AI-first cloud — written for engineers shipping real workloads.

BigQuery Interview Questions: Architecture, Optimisation, and Real-World Scenarios

These questions cover what interviewers actually ask when assessing BigQuery depth — not surface-level features but architecture decisions, cost trade-offs, and production problem-solving.


Architecture and Internals

Q: Explain BigQuery’s separation of storage and compute. Why does it matter?

BigQuery stores data in Google’s Colossus distributed file system using a columnar format called Capacitor. The compute layer — the Dremel query engine — is completely decoupled from that storage. When a query runs, BigQuery allocates slots (units of CPU + RAM) dynamically, executes the query, and releases those slots. No data is stored on worker nodes.

This matters because you can run enormous analytical queries without provisioning or managing a cluster. You also pay for compute (slots or bytes scanned) independently of storage, which reduces waste when data is hot for queries but cold for writes.

Q: What is the Dremel engine and how does it parallelize query execution?

Dremel is BigQuery’s distributed query engine, developed at Google and described in a 2010 paper. It uses a serving tree architecture: a root server receives the query, distributes sub-tasks to intermediate nodes, which fan out further to leaf nodes that actually read data from Colossus. Results aggregate back up the tree. This architecture can parallelize across thousands of nodes simultaneously, which is why a scan of 1 TB takes roughly the same wall-clock time as a scan of 1 GB — more data just means more leaves working in parallel.

Q: What is a BigQuery slot and how does the on-demand vs reservation model differ?

A slot is approximately one vCPU of compute. On-demand pricing gives you access to a shared slot pool and charges $5 per TB of data scanned, with no guaranteed throughput. Reservations let you purchase a fixed number of slots (or commit to flex or annual capacity) and assign them to projects or folders, giving predictable performance without per-query cost surprises. For organizations with large, steady query workloads, reservations are almost always cheaper than on-demand.


Partitioning vs Clustering

Q: When would you choose partitioning, and when would you add clustering on top?

Partitioning divides a table into physical segments — typically by date or integer range. BigQuery can prune entire partitions before executing a query. Use partitioning when your queries almost always filter on the partition column and when you want to reduce both cost (bytes scanned) and latency.

Clustering sorts data within each partition by one to four columns. BigQuery can skip blocks of rows that fall outside a filter. Add clustering when:

The two are complementary. Partition by date for time-based pruning, cluster by frequently filtered dimensions for row-level pruning within each partition.

Q: A query on a 5 TB partitioned table is still scanning the whole table. What are likely causes?

Common causes:

  1. The partition filter is applied after the scan — for example, filtering on a computed expression rather than the raw partition column
  2. The query uses _PARTITIONTIME as a string comparison rather than a TIMESTAMP comparison
  3. The partition filter is inside a subquery that BigQuery cannot push down
  4. The table is partitioned but the partition column is not actually used in the WHERE clause
  5. require_partition_filter is not enforced, so full scans are allowed without an error

Q: Can you cluster without partitioning?

Yes. You can apply clustering to an unpartitioned table. BigQuery will still organize data physically by the cluster columns and prune blocks during queries. However, without partitioning, you lose the ability to prune entire date ranges of data at the partition level.


Cost Control

Q: What does maximum_bytes_billed do and why would you use it?

Setting maximum_bytes_billed on a query cancels execution and returns an error if the estimated bytes to scan exceeds the threshold. It acts as a cost circuit breaker. This is especially useful in shared environments where a miswritten query with a cross join could trigger a multi-TB scan. You can set it at the query level or as a project default.

Q: How does long-term storage pricing work?

Any table (or table partition) that has not been modified for 90 consecutive days transitions automatically to long-term storage pricing, which is roughly 50% cheaper than active storage. This transition happens at the partition level — if you have a partitioned table and never update January 2023 partitions, those partitions become long-term storage while more recent partitions remain at active pricing. There is no performance difference between active and long-term storage.

Q: What is a materialized view and when does it help with cost?

A materialized view is a precomputed result set that BigQuery refreshes automatically when the base tables change (within defined bounds). When a query can be answered from the materialized view, BigQuery rewrites the query transparently. This reduces bytes scanned because instead of aggregating millions of raw rows on every dashboard refresh, you query a much smaller precomputed table. Materialized views are most effective for repetitive aggregation queries, such as daily rollups run hundreds of times by a dashboard.


Streaming vs Batch Ingest

Q: What are the trade-offs between streaming inserts and batch loads?

┌─────────────────────┬────────────────────────────┬──────────────────────────────┐
│ Dimension │ Streaming Inserts │ Batch Load │
├─────────────────────┼────────────────────────────┼──────────────────────────────┤
│ Latency │ < 1 second │ Minutes (async job) │
│ Cost │ $0.01 per 200 MB │ Free (up to limits) │
│ Delivery guarantee │ At-least-once │ Exactly-once (load job) │
│ Query availability │ Immediate │ After job completes │
│ Max row size │ 10 MB │ Depends on format │
│ Best for │ Real-time event ingestion │ Historical or nightly loads │
└─────────────────────┴────────────────────────────┴──────────────────────────────┘

Q: What is the Storage Write API and how does it differ from streaming inserts?

The Storage Write API is a newer, higher-throughput ingest path that supports three modes: buffered (committed in batches), committed (immediately visible), and pending (default buffered; flush atomically). Unlike the older streaming inserts API, the Storage Write API supports exactly-once semantics via stream offsets, enabling duplicate-free ingest at scale. It also costs less than streaming inserts for high-throughput workloads.


BigQuery ML

Q: Name three model types supported in BigQuery ML and a use case for each.

Q: Why would you use BigQuery ML instead of exporting data to Vertex AI or a local environment?

When your training data already lives in BigQuery and is measured in terabytes, exporting to another environment adds latency, transfer cost, and operational complexity. BigQuery ML trains directly against the data where it sits. For models that fit the supported types and where the training data is in BigQuery anyway, this eliminates an entire export-and-import step. For more complex architectures, custom frameworks, or very large model training, Vertex AI custom training is the better choice.


Real-World Scenarios

Q: A daily dashboard query is slow and costs $80 per run. Walk through your optimisation approach.

  1. Run EXPLAIN or check the query execution plan in the BigQuery console to find the expensive stages
  2. Verify the base table is partitioned on the date column used in the filter
  3. Check if the filter actually prunes partitions — look at the “Bytes processed” estimate before and after adding the filter
  4. Check if the table is clustered on other high-cardinality filter columns
  5. Consider a materialized view if the aggregation pattern is consistent
  6. Evaluate whether the dashboard can query a pre-aggregated summary table refreshed by a scheduled query, rather than hitting the raw fact table directly

Q: You need to load 500 GB of Parquet files from GCS into BigQuery every night. Design the process.

Use a batch load job, which is free. Schedule it with Cloud Composer or a Cloud Scheduler + Cloud Run trigger. Load into a staging table with a _PARTITIONTIME or ingestion-time partition, then use a MERGE or INSERT INTO… SELECT to move rows into the production table with deduplication logic. Write a load audit row to a log table on success. Configure a dead-letter path (alert via Cloud Monitoring) if the load job fails.


Summary

Strong BigQuery interview answers demonstrate three things: understanding of the distributed architecture (Dremel, Colossus, slots), judgment on cost and performance trade-offs (partitioning, clustering, on-demand vs reservations), and practical experience with real pipeline patterns (streaming vs batch, materialized views, BigQuery ML). The questions above cover all three dimensions and can be extended with hands-on scenario questions about specific datasets or business requirements.