Data Engineering  /  dbt

🔄 dbt — Data Build Tool 23 guides · updated 2026

Analytics engineering with SQL — models, tests, sources, and Jinja macros that turn raw warehouse tables into trustworthy, documented data products.

dbt Incremental Models

In data pipelines, especially in analytics, raw datasets often grow every day. If you naively rebuild your entire dataset (or model) from scratch every time you run your pipeline, you waste time, compute, and warehouse resources. That’s where incremental models in dbt come in: they allow you to process only the new or changed rows since the last run, rather than reprocessing the entire dataset.

In dbt, an incremental model is a variation of the table materialization where you include logic to append or merge new data, rather than rebuilding everything. It uses the built-in macro is_incremental() to detect when we should apply incremental logic. The first run is always a “full load” (i.e. build everything). Subsequent runs (if not forced full refresh) use incremental filters to only act upon delta data. ([dbt Developer Hub][1])

In this article I’ll cover:

  1. The fundamentals: how incremental models work under the hood
  2. Key configuration and strategies (unique_key, filtering, incremental_strategy)
  3. Three example programs (SQL model files) covering various use cases
  4. A Merquine-style diagram / flow showing lifecycle and dependencies
  5. Memory / exam / interview preparation techniques
  6. Why incremental models are vital in real projects
  7. Common pitfalls, best practices, and trade-offs

Let’s begin with internal mechanics.


1. How dbt Incremental Models Work

Lifecycle & Mechanics

When you designate a model as incremental, dbt alternates between two modes:

The is_incremental() macro returns true only under conditions:

  1. The target table (model) already exists in the warehouse
  2. The model is configured with materialized='incremental'
  3. You are not running with --full-refresh

If any of those is false, is_incremental() is false, and the model runs its “full logic” branch. ([dbt Developer Hub][1])

In your SQL, you typically write:

{{ config(materialized='incremental', unique_key='some_id') }}
select
...
from {{ ref('upstream_source') }}
{% if is_incremental() %}
where <filter_condition_using {{ this }}>
{% endif %}

Here, {{ this }} refers to the existing target table. Using that, you can compare timestamps or IDs. ([dbt Developer Hub][2])

Key Ingredients: Filtering + Unique Keys

To make incremental logic safe and correct, you need:

  1. A filter condition in the is_incremental() branch that isolates new or updated rows—often by comparing a timestamp or version column to the maximum in existing table.
  2. A unique_key configuration (optional but highly recommended) that tells dbt how to identify duplicates or updates (so you can do merging or upserts if needed). ([dbt Developer Hub][1])
  3. Optionally, you may choose different incremental strategies (e.g. merge, append, insert_overwrite, microbatch) depending on your warehouse capabilities. ([dbt Developer Hub][3])

The incremental strategy influences how dbt applies the delta data: e.g. some warehouses support atomic MERGE statements; others implement a DELETE + INSERT. ([dbt Developer Hub][4])

Summary of behavior

Incremental models thus provide a balance: parts of your data are computed persistently (like a table), but you avoid wasteful full recomputation each run.


2. Configuration & Strategies

Basic Configuration

In your model file, you configure like this:

{{ config(
materialized = 'incremental',
unique_key = 'id'
) }}
select ...
from upstream
{% if is_incremental() %}
where <filter logic>
{% endif %}

You must ensure the SQL is valid both with and without the is_incremental() clause. That is, the “base query” (outside the if) must work even when executed as a full load.

Filtering logic patterns

Common filtering techniques:

You must ensure there are no gaps or missing updates — your filter should be inclusive enough to cover late arrivals or small backfills.

Unique key & merge logic

Incremental strategies

Some strategies available in dbt:

Which you choose depends on:

Full-refresh and schema changes


3. Example Programs (SQL Models) for Incremental Patterns

Below are 3 unique example scenarios, each with example SQL model files illustrating different use cases of incremental logic.

Example Set 1: Simple event ingestion (append-only)

Example 1A: incremental_events_append.sql

{{ config(
materialized = 'incremental',
unique_key = 'event_id',
incremental_strategy = 'append'
) }}
select
event_id,
user_id,
event_type,
event_ts
from {{ ref('raw_events') }}
{% if is_incremental() %}
where event_ts > (select coalesce(max(event_ts), '1900-01-01') from {{ this }})
{% endif %}

Example 1B: incremental_page_views.sql

{{ config(
materialized = 'incremental',
unique_key = 'view_id'
) }}
with clean as (
select
view_id,
user_id,
view_ts,
page_url
from {{ ref('raw_page_views') }}
)
select * from clean
{% if is_incremental() %}
where view_ts > (select max(view_ts) from {{ this }})
{% endif %}

Example 1C: incremental_logs for JSON parsing

{{ config(
materialized = 'incremental',
unique_key = 'log_id'
) }}
with parsed as (
select
(value_json->>'log_id')::int as log_id,
(value_json->>'user') as username,
(value_json->>'action') as action,
ingest_ts
from {{ ref('raw_logs') }},
lateral flatten(input => raw_logs.json_value) as value_json
)
select * from parsed
{% if is_incremental() %}
where ingest_ts > (select max(ingest_ts) from {{ this }})
{% endif %}

Example Set 2: Merging dimension updates (updates + inserts)

Example 2A: incremental_customers_merge.sql

{{ config(
materialized = 'incremental',
unique_key = 'customer_id',
incremental_strategy = 'merge'
) }}
with base as (
select
customer_id,
name,
email,
updated_at
from {{ ref('raw_customers') }}
)
select * from base
{% if is_incremental() %}
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Example 2B: incremental_products_merge_updates.sql

{{ config(
materialized = 'incremental',
unique_key = 'product_id',
incremental_strategy = 'merge'
) }}
with prod as (
select
product_id,
category,
price,
last_modified
from {{ ref('raw_products') }}
)
select * from prod
{% if is_incremental() %}
where last_modified > (select max(last_modified) from {{ this }})
{% endif %}

Example 2C: incremental_orders_merge.sql

{{ config(
materialized = 'incremental',
unique_key = 'order_id',
incremental_strategy = 'merge'
) }}
with ord as (
select
order_id,
customer_id,
status,
total,
updated_at
from {{ ref('raw_orders') }}
)
select * from ord
{% if is_incremental() %}
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Example Set 3: Partitioned / time window incremental / microbatch

Example 3A: incremental_sales_partitioned.sql

{{ config(
materialized = 'incremental',
unique_key = 'sale_id',
partition_by = {'field':'sale_date', 'data_type':'date'}
) }}
select
sale_id,
product_id,
sale_date,
amount
from {{ ref('raw_sales') }}
{% if is_incremental() %}
where sale_date > (select max(sale_date) from {{ this }})
{% endif %}

Example 3B: incremental_events_windowed.sql

{{ config(
materialized = 'incremental',
unique_key = 'event_id',
incremental_strategy = 'merge'
) }}
with events as (
select
event_id,
user_id,
event_time,
properties
from {{ ref('raw_events') }}
)
select * from events
{% if is_incremental() %}
where event_time >= (select dateadd(day, -1, max(event_time)) from {{ this }})
{% endif %}

Example 3C: microbatch incremental (if supported)

{{ config(
materialized = 'incremental',
unique_key = 'session_id',
incremental_strategy = 'microbatch',
incremental_predicates = ["session_start >= timestampadd(day, -7, current_timestamp)"]
) }}
select
session_id,
user_id,
session_start,
session_end
from {{ ref('raw_sessions') }}
{% if is_incremental() %}
where session_start > (select max(session_start) from {{ this }})
{% endif %}

4. Merquine-Style Flow / Diagram for Incremental Models

Here’s a conceptual flow describing how incremental models fit in dbt’s pipeline:

Image

Image

Image

Textual representation:

raw / source tables
dbt compile (resolves SQL, macros, config)
for each incremental model:
├─ check: does target table exist?
│ │
│ └─ No → full load (build entire table)
├─ else if full-refresh flag? → full load
└─ else:
▸ is_incremental() = true
▸ filter source for new / changed rows
▸ apply incremental strategy (merge / append / microbatch)
▸ upsert / insert into existing table
execute incremental SQL in warehouse
new rows inserted / updated
full lineage graph, tests, documentation, next models run

In a larger DAG, incremental models sit between downstream models: the next models refer to them via ref(), so they benefit from fresh but efficient incremental updates.


5. Memory / Interview / Exam Preparation Tips

Here are strategies to remember, explain, and master incremental models.

Mnemonics & Key Terms

Flashcards / Q&A

Whiteboard / verbal explanation

Interview narrative

In an interview, you could answer:

“In dbt, an incremental model lets you process only the delta data rather than rebuilding the entire table. You use {{ config(materialized='incremental') }} and wrap logic inside if is_incremental() to filter new or updated rows. You also configure a unique_key and optionally an incremental strategy like merge to safely upsert. If the target table doesn’t exist or you use --full-refresh, dbt falls back to full load. Incremental models are essential for scaling pipelines, reducing compute cost, and speeding deployments.”

Be prepared to discuss trade-offs: what happens with schema changes, how to catch late-arriving data, deduplication, etc.


6. Why Incremental Models Are Important (Real-World Value)

Incremental models matter not just as a theoretical feature — they are essential in production data systems. Here’s why:

  1. Performance and speed When datasets grow to millions or billions of rows, rebuilding everything takes too long. Incremental models let you run fast daily (or hourly) updates by processing small deltas. ([dbt Developer Hub][4])

  2. Cost efficiency Cloud warehouses charge by compute usage. Running full-table transformations repeatedly is costly. Incremental logic reduces compute consumption dramatically. ([dbt Developer Hub][1])

  3. Scalability As your project and number of models grow, you’ll need efficient pipelines. Incremental models support growth by avoiding frequent large-scale rebuilds.

  4. Timeliness & freshness You can update your downstream analytics more frequently because incremental runs are faster, enabling near real-time insights.

  5. Flexibility & control You decide how much data to process, which updates to include, how to merge, how to handle late arrivals, schema changes, etc.

  6. Reduced risk Because incremental runs are smaller, errors or performance regressions surface faster; you’re less likely to crash entire pipeline builds.

In sum, incremental models enable real-world data engineering at scale, balancing correctness, speed, cost, and maintainability.


7. Pitfalls, Trade-offs, & Best Practices

No tool is magic — here are common pitfalls and how to mitigate them.

Pitfalls & Issues

PitfallDescriptionMitigation / Best Practice
Incomplete filtering logicIf your where clause misses some records (e.g., late arrivals), you may lose updates or data gapsAdd buffer (e.g. subtract a day), or reprocess overlapping window
No unique_keyWithout unique key, duplicates may accumulateAlways supply a reliable unique_key when updates are possible
Schema changes not reflectedIf upstream adds columns or changes types, incremental logic may not pick theseUse --full-refresh or config on_schema_change when supported
Overlapping runs causing duplicatesIf runs overlap in timing, duplicate or conflicting rows may ariseUse transactional merge or careful window logic
Performance degrade over timeTable size grows big; incremental insert or merge slowsPeriodically vacuum, compact, or do full rebuilds
Testing incremental logic is harderBecause only small delta runs, mistakes in filter logic may go unnoticedUse dbt compile to inspect generated SQL; test with small datasets or run full-refresh to validate
Adapter limitationsSome warehouses don’t fully support merge, or incremental in partitioned tablesChoose strategy supported by your adapter; consult docs for merge, insert_overwrite, etc. ([dbt Developer Hub][3])

Best practices


8. Summary & Next Steps