Data Engineering  /  dbt

🔄 dbt — Data Build Tool 23 guides · updated 2026

Analytics engineering with SQL — models, tests, sources, and Jinja macros that turn raw warehouse tables into trustworthy, documented data products.

dbt Ephemeral Models

In dbt (data build tool), ephemeral models are a powerful concept for creating reusable, temporary SQL transformations that do not generate tables or views in your data warehouse. Instead, they are inlined as Common Table Expressions (CTEs) into downstream models.

Ephemeral models allow you to modularize logic, improve readability, and keep your warehouse clean without creating unnecessary physical objects. They are especially useful for small, intermediate transformations used in multiple downstream models.

In this article, we will explore:

  1. What ephemeral models are and how they work
  2. Key characteristics and rules
  3. Three detailed examples with SQL code
  4. A Merquine-style diagram showing ephemeral model execution
  5. Memory techniques for interviews and exams
  6. Why ephemeral models are important
  7. Common pitfalls and best practices

1. What Are Ephemeral Models?

Conceptual Definition

{{ config(materialized='ephemeral') }}
select
user_id,
lower(email) as email_clean
from {{ ref('raw_users') }}

If another model references this ephemeral model, dbt will insert the SQL above as a CTE in that model, rather than creating a separate table or view.


1.1 Key Characteristics

FeatureExplanation
No storageEphemeral models do not consume warehouse storage.
Inline logicSQL is inserted into downstream models at compile time.
ReusableLogic can be modularized and reused in multiple models.
Fast iterationIdeal for temporary transformations or helper logic.
Cannot be queried directlySince no table/view exists, you cannot run SELECT * FROM ephemeral_model.
No incremental supportis_incremental() always returns false.

1.2 Why Use Ephemeral Models?

  1. Reduce warehouse clutter: Avoid unnecessary tables and views.
  2. Encourage modularity: Break complex SQL into smaller, reusable blocks.
  3. Improve maintainability: Centralize transformations like cleaning, formatting, or filtering.
  4. Optimize query compilation: Complex transformations can be nested directly into downstream logic without intermediate storage.

2. Ephemeral Models vs Other Materializations

MaterializationStorageExecutionUse Case
tableYesPrecomputedLarge datasets, high query performance
viewNoRecomputed on readLightweight, always fresh
incrementalYesNew/updated rowsGrowing datasets, cost-efficient
ephemeralNoInlined CTETemporary logic, helper transformations

Key difference: ephemeral models are never stored, only inlined, making them ideal for intermediate transformations.


3. How Ephemeral Models Work

  1. Write the ephemeral model

    {{ config(materialized='ephemeral') }}
    select user_id, lower(email) as email_clean from {{ ref('raw_users') }}
  2. Reference it in another model

    {{ config(materialized='table') }}
    with cleaned_users as (
    select * from {{ ref('ephemeral_clean_users') }}
    )
    select
    u.user_id,
    u.email_clean,
    o.order_id
    from cleaned_users u
    join {{ ref('raw_orders') }} o on u.user_id = o.customer_id
  3. dbt compile: The SQL from the ephemeral model is inserted into the CTE cleaned_users.

  4. dbt run: Downstream model executes the SQL, including inlined ephemeral logic.

Benefit: You never create an extra table, but downstream models can still reuse the logic.


4. Example Programs for Ephemeral Models

Below are three unique example scenarios showing ephemeral model usage.

Example Set 1: Cleaning User Data

ephemeral_clean_users.sql

{{ config(materialized='ephemeral') }}
select
id as user_id,
lower(trim(email)) as email_clean,
coalesce(first_name, 'unknown') as first_name_clean,
coalesce(last_name, 'unknown') as last_name_clean
from {{ ref('raw_users') }}
where active = true

downstream model using it:

{{ config(materialized='table') }}
with cleaned_users as (
select * from {{ ref('ephemeral_clean_users') }}
)
select
u.user_id,
u.email_clean,
count(o.order_id) as order_count
from cleaned_users u
join {{ ref('raw_orders') }} o on u.user_id = o.customer_id
group by 1, 2

Explanation:


Example Set 2: Calculating Metrics for Reuse

ephemeral_user_metrics.sql

{{ config(materialized='ephemeral') }}
select
user_id,
sum(amount) as total_spent,
count(distinct order_id) as orders_count
from {{ ref('raw_orders') }}
group by user_id

downstream model:

{{ config(materialized='table') }}
with metrics as (
select * from {{ ref('ephemeral_user_metrics') }}
)
select
u.id as user_id,
u.signup_date,
m.total_spent,
m.orders_count
from {{ ref('raw_users') }} u
left join metrics m on u.id = m.user_id

Benefit:


Example Set 3: Filtering & Transforming Event Data

ephemeral_filtered_events.sql

{{ config(materialized='ephemeral') }}
select
event_id,
user_id,
event_type,
event_ts
from {{ ref('raw_events') }}
where event_type in ('purchase', 'signup')

downstream model:

{{ config(materialized='table') }}
with filtered_events as (
select * from {{ ref('ephemeral_filtered_events') }}
)
select
e.user_id,
count(*) as event_count,
max(event_ts) as last_event
from filtered_events e
group by e.user_id

Explanation:


5. Merquine-Style Diagram for Ephemeral Models

raw tables
ephemeral models (inlined SQL)
downstream models (table/view/incremental)
BI dashboards / analytics

6. Memory & Interview/Exam Tips

Mnemonics

Key Points to Remember

  1. Ephemeral = inlined CTE
  2. Cannot be queried directly
  3. Reusable logic for downstream models
  4. No storage, no incremental logic
  5. Compile-time only

Flashcards / Q&A

Interview Talking Points


7. Importance of Ephemeral Models

  1. Clean warehouse: Avoid unnecessary tables or views.
  2. Reusable logic: Modularize SQL transformations, maintain a single source of truth.
  3. Faster iterations: No separate build steps; ephemeral SQL executes inline.
  4. Cost efficiency: No storage or materialization cost.
  5. Encourages maintainable design: Simplifies dependency chains in large projects.

In modern analytics projects, ephemeral models are critical for intermediate transformations and modular pipelines.


8. Common Pitfalls & Best Practices

PitfallExplanationHow to Avoid
Overcomplicating ephemeral logicVery complex transformations can slow downstream executionKeep ephemeral models simple; consider materializing if complex
Deep nestingMultiple ephemeral models nested in one downstream query can impact compile/execution timeLimit nesting depth; consider materializing heavy intermediate results
Debugging difficultiesCannot query ephemeral directlyUse dbt compile to inspect inlined SQL
Using ephemeral for huge datasetsLarge datasets may perform better as table or incrementalMaterialize intermediate models for big tables
Forgetting dependenciesReference upstream sources correctly via ref()Always use ref() for lineage and dependency tracking

Best Practices


9. Summary