dbt Project Structure: What Lives Where and Why
Running dbt init my_project creates a directory with a specific layout. Every folder in that layout has a purpose, and understanding them early saves you a lot of confusion later. This page walks through the standard dbt project structure, what each component does, and how real teams organize things at scale.
The Default Layout

    my_project/
    ├── models/
    │   ├── staging/
    │   │   └── stg_orders.sql
    │   ├── intermediate/
    │   │   └── int_customer_orders.sql
    │   └── marts/
    │       └── fct_revenue.sql
    ├── macros/
    │   └── cents_to_dollars.sql
    ├── seeds/
    │   └── country_codes.csv
    ├── snapshots/
    │   └── customers_snapshot.sql
    ├── analyses/
    │   └── ad_hoc_revenue_check.sql
    ├── tests/
    │   └── assert_positive_revenue.sql
    ├── dbt_project.yml
    └── profiles.yml  (usually in ~/.dbt/, not the project)

Each of these serves a different role in the pipeline. Let us go through them one by one.
models/
The models/ directory is where all your transformation logic lives. Every .sql file here becomes a node in the DAG — either a view, table, incremental table, or ephemeral CTE depending on configuration.
The most common way to organize models is by layer:

    models/
    ├── staging/        ← one model per raw source table, minimal transformation
    ├── intermediate/   ← joins, enrichment, business logic
    └── marts/          ← aggregated, BI-ready outputs

Each layer builds on the one above it. Staging models read from source() declarations. Intermediate and mart models read from other models using ref().
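A source() call resolves only against a source that has been declared in a YAML file. As a minimal sketch, assuming a warehouse schema named raw that holds orders and customers tables (the file name and schema name are illustrative, not taken from a real project), the declaration might look like this:

```yaml
# models/staging/sources.yml (hypothetical file name)
version: 2

sources:
  - name: raw            # first argument to source()
    schema: raw          # assumed warehouse schema; adjust to your setup
    tables:
      - name: orders     # referenced as source('raw', 'orders')
      - name: customers  # referenced as source('raw', 'customers')
```

Declaring sources this way also puts the raw tables into the DAG, so lineage starts at the warehouse tables rather than at the first staging model.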
A simple staging model looks like this:

    -- models/staging/stg_orders.sql
    select
        id as order_id,
        customer_id,
        cast(created_at as date) as order_date,
        amount_cents / 100.0 as amount_usd,
        lower(status) as status
    from {{ source('raw', 'orders') }}

And a schema file in the same folder documents and tests it:

    version: 2

    models:
      - name: stg_orders
        description: "Cleaned orders from the raw transactional database"
        columns:
          - name: order_id
            tests:
              - not_null
              - unique
          - name: status
            tests:
              - accepted_values:
                  values: ['placed', 'shipped', 'returned', 'cancelled']

Tests defined here run with dbt test. Keeping the schema file next to the models it describes, rather than in one central file, makes it much easier to maintain as projects grow.
macros/
Macros are Jinja functions you write once and reuse across models. They live in the macros/ directory and are available to all models in the project automatically.
A common use case is wrapping a repeated calculation:

    -- macros/cents_to_dollars.sql
    {% macro cents_to_dollars(column_name) %}
        {{ column_name }} / 100.0
    {% endmacro %}

Then in any model:

    select
        id as order_id,
        {{ cents_to_dollars('amount_cents') }} as amount_usd
    from {{ source('raw', 'orders') }}

Macros also handle more complex patterns: dynamic SQL generation, environment-specific logic, or generating repetitive clauses across multiple columns. The dbt-utils package (which you install via packages.yml) provides a large library of ready-made macros that most teams use to avoid reinventing common patterns.
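Installing dbt-utils might look like the following sketch. The version range is an assumption; check the package's release notes for the range that matches your dbt version.

```yaml
# packages.yml (project root, next to dbt_project.yml)
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]  # assumed range; pin to what you have tested
```

Running dbt deps then downloads the package into dbt_packages/, after which its macros can be called with the dbt_utils. prefix.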
seeds/
Seeds are CSV files that dbt loads into your warehouse as tables. They are ideal for small, static reference data that changes infrequently — country codes, product categories, marketing channel mappings, exchange rates.

    seeds/
    └── country_codes.csv

Load them with:

    dbt seed

dbt creates a table in your warehouse matching the CSV filename. You can reference it in models just like any other dbt object:

    select * from {{ ref('country_codes') }}

One important note: seeds are not designed for large datasets. If a CSV file has more than a few thousand rows, loading it via seed will be slow. For larger lookup tables, load them to your warehouse separately and declare them as sources instead.
snapshots/
Snapshots capture the historical state of a table over time, a pattern known as a type 2 slowly changing dimension (SCD). If you need to know what a customer record looked like six months ago, snapshots handle that.

    -- snapshots/customers_snapshot.sql
    {% snapshot customers_snapshot %}

    {{ config(
        target_schema='snapshots',
        unique_key='customer_id',
        strategy='timestamp',
        updated_at='updated_at'
    ) }}

    select * from {{ source('raw', 'customers') }}

    {% endsnapshot %}

Run with dbt snapshot. Each execution checks for changes. When a record has changed, dbt inserts a new row and closes out the previous one, using the dbt_valid_from and dbt_valid_to timestamps to track when each version was active.
Two snapshot strategies exist:
- timestamp: detects changes by comparing an updated_at column
- check: detects changes by comparing values in specified columns
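The check strategy is useful when the source table has no reliable updated_at column. The sketch below assumes dbt 1.9 or later, where snapshots can be defined in YAML instead of a snapshot block; the snapshot name and the email and plan_tier columns are hypothetical.

```yaml
# snapshots/customers_check_snapshot.yml (hypothetical file; assumes dbt 1.9+)
snapshots:
  - name: customers_check_snapshot
    relation: source('raw', 'customers')
    config:
      schema: snapshots
      unique_key: customer_id
      strategy: check
      check_cols: ['email', 'plan_tier']  # a new version is recorded when either value changes
```

Setting check_cols to all compares every column, which is simpler to configure but does more work on wide tables.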
analyses/
The analyses/ folder is for SQL files you want version-controlled but not materialized as warehouse objects. Useful for:
- Ad-hoc investigative queries
- Complex SQL you are developing before turning into a model
- One-off reports that finance or ops teams request

    -- analyses/ad_hoc_revenue_check.sql
    select
        date_trunc('month', order_date) as month,
        sum(amount_usd) as revenue
    from {{ ref('fct_revenue') }}
    where order_date >= '2025-10-01'
    group by 1
    order by 1

Run dbt compile to resolve the Jinja and generate runnable SQL in target/compiled/. You can then copy that SQL and run it directly in your warehouse client.
tests/
While most tests live in schema YAML files, the tests/ directory holds singular tests — custom SQL queries that return rows when something is wrong.

    -- tests/assert_positive_revenue.sql
    select
        order_id,
        amount_usd
    from {{ ref('fct_revenue') }}
    where amount_usd < 0

If this query returns any rows, the test fails. Singular tests are useful for business logic rules that are hard to express with the built-in test types (not_null, unique, accepted_values, relationships).
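For comparison, the built-in relationships test is declared in a schema YAML file rather than in tests/. A sketch, assuming a stg_customers model with a customer_id column exists in the project:

```yaml
# models/staging/schema.yml (illustrative addition)
version: 2

models:
  - name: stg_orders
    columns:
      - name: customer_id
        tests:
          - relationships:
              to: ref('stg_customers')  # assumed parent model
              field: customer_id        # every order must point at an existing customer
```

A rule like "revenue is never negative" compares a value against a business expectation rather than against another column or table, which is why it fits a singular test better.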
dbt_project.yml
This is the configuration file that makes a folder a dbt project. It sits at the root and controls global behavior.

    name: my_project
    version: '1.0.0'
    profile: my_profile

    model-paths: ["models"]
    seed-paths: ["seeds"]
    macro-paths: ["macros"]
    snapshot-paths: ["snapshots"]
    analysis-paths: ["analyses"]
    test-paths: ["tests"]

    models:
      my_project:
        staging:
          +materialized: view
          +tags: ["staging"]
        intermediate:
          +materialized: table
        marts:
          +materialized: table
          +tags: ["mart"]

The models: section applies configuration by folder path. The + prefix marks a key as a configuration rather than a folder name, so dbt applies it to every model under that path. This avoids having to set materialization in every individual model file, although a model can still override the folder default with its own config() block.
How the Pieces Connect at Runtime
When you run dbt run, here is what happens:

    dbt_project.yml            → sets configuration and paths
            |
            v
    models/ scanned            → all .sql files discovered
            |
            v
    ref() and source()         → dependency graph built
            |
            v
    topological sort           → determines execution order
            |
            v
    Jinja compiled             → SQL resolved to final form
            |
            v
    warehouse execution        → tables and views created
            |
            v
    target/compiled/, target/run/ → compiled and executed SQL saved for inspection

The target/ directory is generated automatically and should not be committed to version control. Add it to .gitignore.
Multi-Domain Organization
For larger teams, the standard single-folder structure can get unwieldy. A common pattern in 2025 is organizing by domain within the layers:

    models/
    ├── staging/
    │   ├── finance/
    │   │   └── stg_invoices.sql
    │   └── marketing/
    │       └── stg_campaigns.sql
    ├── marts/
    │   ├── finance/
    │   │   └── fct_revenue.sql
    │   └── marketing/
    │       └── fct_campaign_performance.sql

You can apply dbt_project.yml configuration at any subfolder level:

    models:
      my_project:
        marts:
          finance:
            +tags: ["finance", "mart"]
          marketing:
            +tags: ["marketing", "mart"]

With dbt Mesh (stable since 2024), you can take this further and split large projects into separate dbt projects that share models through cross-project references. Each domain team owns their project while exposing specific public models to others.
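Exposing a model to other projects is done through the model's access setting. The sketch below assumes the finance team's project wants to publish fct_revenue; the file path is hypothetical.

```yaml
# models/marts/finance/schema.yml in the producing project (hypothetical path)
version: 2

models:
  - name: fct_revenue
    access: public   # allows other projects to reference this model
```

A consuming project would then list the upstream project in its dependencies.yml and use a two-argument ref such as ref('finance_project', 'fct_revenue'), where finance_project is a hypothetical project name.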
Naming Conventions That Most Teams Follow
| Prefix | Layer | Example |
|---|---|---|
| stg_ | Staging | stg_orders, stg_customers |
| int_ | Intermediate | int_customer_orders |
| fct_ | Fact table (mart) | fct_revenue, fct_sessions |
| dim_ | Dimension table (mart) | dim_customers, dim_products |
| mrt_ | General mart | mrt_weekly_summary |
These prefixes are convention, not enforced by dbt. But they make the DAG much easier to read at a glance, and most teams adopt them because the clarity is worth the extra characters.
What to Put in .gitignore

    target/
    dbt_packages/
    logs/
    .env
    profiles.yml

The target/ and dbt_packages/ directories are generated at runtime and should never be committed. The profiles.yml file contains credentials and should never be in source control; use environment variables or dbt Cloud’s managed credentials instead.
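One way to keep secrets out of profiles.yml is dbt's env_var() function, which reads values from the environment at runtime. The following is a sketch assuming a Postgres adapter; the environment variable names, database name, and schema are hypothetical.

```yaml
# ~/.dbt/profiles.yml (sketch; assumes the Postgres adapter)
my_profile:
  target: dev
  outputs:
    dev:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"          # hypothetical variable names
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      port: 5432
      dbname: analytics                          # assumed database name
      schema: dbt_dev                            # assumed development schema
      threads: 4
```

With this approach the file holds no credentials of its own, and each developer or CI job supplies its own values through the environment.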
Quick Reference
| Location | Purpose |
|---|---|
| models/ | SQL transformation files |
| models/*/schema.yml | Documentation, column descriptions, tests |
| macros/ | Reusable Jinja/SQL functions |
| seeds/ | Static CSV reference data |
| snapshots/ | Historical change tracking (SCD) |
| analyses/ | Version-controlled ad-hoc queries |
| tests/ | Custom singular data quality tests |
| dbt_project.yml | Project configuration and folder-level settings |
| packages.yml | External dbt package dependencies |
| target/ | Generated output (do not commit) |
Getting comfortable with this layout is the first step to working efficiently in dbt. Once you know where things belong, writing models and debugging broken runs both become significantly faster.