dbt Model Dependencies and the ref() Function
When you are working with dbt at any meaningful scale, you quickly realize that the order your models run in matters a lot. A summary table that runs before its source staging model is going to fail every time. dbt solves this problem elegantly through the ref() function, which does two things at once: it tells dbt which model to read data from, and it registers a dependency that shapes the entire execution order.
This page walks through how that works in practice, with examples that reflect how teams actually structure their projects in 2025.
The Core Idea
Every SQL model in dbt lives in a .sql file. When one model needs data from another, you write:
select * from {{ ref('stg_orders') }}That single line does something important. dbt reads it during the compile phase, notes that the current model depends on stg_orders, and adds an edge in the DAG between the two. When you run dbt run, the framework figures out the correct sequence automatically — you never have to think about it manually.
This is fundamentally different from writing raw SQL against a warehouse where you have to know which table to reference and hope it is already populated.
How the DAG Gets Built
DAG stands for Directed Acyclic Graph. “Directed” means dependencies flow in one direction. “Acyclic” means no circular loops — model A cannot depend on model B if B depends on A.
When you call dbt run, the sequence looks like this:
1. dbt reads every .sql file in the models/ directory2. It scans for ref() and source() calls3. It builds a graph of nodes (models) and edges (dependencies)4. It runs models in topological order — parents before childrenA simple three-layer project looks like this in the graph:
raw_orders (source) | vstg_orders (staging) | vcustomer_orders (core) | vsales_summary (mart)Each arrow represents a ref() call in the downstream model. dbt guarantees that stg_orders finishes before customer_orders starts, and so on down the chain.
Practical Examples
Staging Layer — Cleaning Raw Data
The staging layer is where raw source tables get cleaned up. You reference source tables with source(), not ref(), because they live outside your dbt project.
-- models/staging/stg_orders.sqlselect id as order_id, customer_id, cast(created_at as date) as order_date, amount, statusfrom {{ source('raw', 'orders') }}where status != 'test'Core Layer — Joining Two Staging Models
Once staging models exist, the core layer can reference them using ref().
-- models/core/customer_orders.sqlselect c.customer_id, c.customer_name, count(o.order_id) as total_orders, sum(o.amount) as total_spent, max(o.order_date) as last_order_datefrom {{ ref('stg_customers') }} as cleft join {{ ref('stg_orders') }} as o on c.customer_id = o.customer_idgroup by 1, 2dbt sees two ref() calls here. Both stg_customers and stg_orders become upstream dependencies. Either one failing will stop this model from running.
Mart Layer — Business-Ready Aggregations
The mart layer is what analysts and BI tools typically connect to.
-- models/marts/sales_summary.sqlselect region, date_trunc('month', order_date) as month, sum(total_spent) as revenue, count(distinct customer_id) as unique_customersfrom {{ ref('customer_orders') }}group by 1, 2This creates a clean three-level chain: staging → core → mart.
Using ref() Inside CTEs
Many teams prefer CTEs over subqueries for readability. ref() works naturally inside them.
with customers as ( select * from {{ ref('stg_customers') }} ), orders as ( select * from {{ ref('stg_orders') }} )
select customers.customer_id, customers.email, orders.amountfrom customersjoin orders using (customer_id)This keeps the SQL flat and easy to read, while still giving dbt the dependency information it needs.
ref() in Incremental Models
One of the most common patterns in 2025 dbt projects is combining ref() with incremental materializations to handle large tables efficiently.
-- models/core/daily_revenue.sql{{ config(materialized='incremental', unique_key='order_date') }}
select order_date, sum(amount) as daily_revenuefrom {{ ref('stg_orders') }}
{% if is_incremental() %} where order_date > (select max(order_date) from {{ this }}){% endif %}
group by 1dbt only processes new rows on subsequent runs. The {{ this }} macro refers to the current model’s existing table in the warehouse, and ref('stg_orders') still registers the upstream dependency correctly.
Cross-Project References (dbt Mesh)
Since dbt Mesh became production-ready in 2024 and matured further in 2025, teams can now reference models across separate dbt projects. The syntax uses a two-part identifier:
select * from {{ ref('finance_project', 'monthly_revenue') }}This pulls data from a model in the finance_project dbt project. The upstream project must expose the model as a public node in its dbt_project.yml. Cross-project refs enforce the same dependency tracking as regular refs, just across project boundaries.
Visualizing Dependencies
After writing your models, you can generate documentation that includes an interactive DAG:
dbt docs generatedbt docs serveOpen localhost:8080 in your browser and click the lineage graph icon on any model. You will see its full upstream and downstream chain. This is one of the most practical debugging tools dbt offers — when a model breaks, the lineage view tells you immediately what else will be affected.
Common Mistakes
Hardcoding table names instead of using ref()
-- Wrongselect * from my_schema.stg_orders
-- Rightselect * from {{ ref('stg_orders') }}Hardcoded names break when schemas change across environments. ref() resolves the correct schema automatically based on your target.
Creating circular dependencies
If model A references B and B references A, dbt will throw an error: Found a cycle. This usually means you need to restructure — often by extracting shared logic into a third model that both can reference.
Inconsistent naming
If a model file is named staging_orders.sql but you reference it as ref('stg_orders'), the run will fail with a model-not-found error. The string inside ref() must match the filename exactly (without the .sql extension).
Dependency Execution Summary
| Concept | What It Does |
|---|---|
ref('model_name') | Declares a dependency and resolves the correct relation in the warehouse |
| DAG | The directed graph dbt builds from all ref() calls to determine run order |
| Topological sort | The algorithm dbt uses to execute models parents-first |
dbt docs serve | Shows the interactive lineage graph for your entire project |
| Cross-project ref | References a public model from another dbt project (dbt Mesh) |
The ref() function is the single most important concept in dbt. Everything from execution order to environment portability to lineage documentation flows through it. Getting comfortable with it early makes every other dbt feature easier to understand.