dbt Environment Management: Dev, Staging, and Production

One of the biggest risks when working with a live data pipeline is accidentally running development code against production data. A schema change in a model, a filter removed during debugging, a join condition that is wrong — any of these can corrupt dashboards or data products that people depend on. Environment management in dbt is how you prevent that by keeping development, validation, and production work completely isolated.


The Core Concept: Targets

In dbt Core, a target is a named configuration in profiles.yml that describes how dbt connects to your warehouse. You can define as many targets as you need and switch between them using the --target flag.

Most teams use at least two targets, dev and prod, and many add a staging target between them.

~/.dbt/profiles.yml
my_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      database: analytics_dev
      schema: dbt_dev
      warehouse: TRANSFORMING_XS
      role: DEVELOPER
    staging:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER_STAGING') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD_STAGING') }}"
      database: analytics_staging
      schema: dbt_staging
      warehouse: TRANSFORMING_S
      role: STAGING_ROLE
    prod:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER_PROD') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD_PROD') }}"
      database: analytics_prod
      schema: dbt_prod
      warehouse: TRANSFORMING_M
      role: PROD_ROLE

Each target points to a different database or schema, uses different credentials, and often a different warehouse size to control cost.

Running against each environment:

Terminal window
dbt run # uses the default target (dev)
dbt run --target staging # runs against staging
dbt run --target prod # runs against production

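You can also select the target through an environment variable, which is useful in CI/CD. dbt Core has not always read a DBT_TARGET variable on its own, so the portable approach is to replace the hardcoded target: dev in profiles.yml with target: "{{ env_var('DBT_TARGET', 'dev') }}" and let the variable decide the default target: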

Terminal window
export DBT_TARGET=prod
dbt run
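
Because an exported variable can silently point a run at production, a thin wrapper can make the choice explicit. The following is a minimal sketch, not part of dbt: the function name resolve_dbt_target and the DBT_ALLOW_PROD opt-in switch are hypothetical, and the allowed target names are taken from the profiles.yml above.

```shell
# Hypothetical helper: pick the dbt target from DBT_TARGET, defaulting to dev.
# DBT_ALLOW_PROD is an illustrative opt-in switch, not a dbt setting.
resolve_dbt_target() {
  local target="${DBT_TARGET:-dev}"
  case "$target" in
    dev|staging)
      ;;
    prod)
      # Require an explicit second signal before touching production.
      if [ "${DBT_ALLOW_PROD:-0}" != "1" ]; then
        echo "refusing to use prod without DBT_ALLOW_PROD=1" >&2
        return 1
      fi
      ;;
    *)
      echo "unknown dbt target: $target" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$target"
}

# Usage (not executed here):
#   target="$(resolve_dbt_target)" && dbt run --target "$target"
```

Passing the result through --target keeps the wrapper independent of how profiles.yml chooses its default.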

The Promotion Flow

A standard promotion pipeline looks like this:

Developer feature branch
        |
        v
dev target (personal schema)
    Fast feedback, limited data, no impact on others
        |
        v  (PR merged)
staging target
    Full data, mirrors prod, automated tests run
        |
        v  (tests pass, approval granted)
prod target
    Live data, read by BI tools and downstream consumers

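This separation means developers get fast feedback without affecting anyone else, every merged change is tested against production-like data in staging, and production only receives changes that have passed tests and been approved.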


Per-Developer Schema Isolation

In a team of five or ten analysts all working in the same dev target, model name collisions become a problem. If two developers are both working on stg_orders, they will overwrite each other’s work.

The solution is to give each developer their own schema prefix. You do this by overriding the generate_schema_name macro:

-- macros/generate_schema_name.sql
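{% macro generate_schema_name(custom_schema_name, node) -%}
    {#- Models without a custom schema fall back to the target's schema -#}
    {%- set base_schema = target.schema if custom_schema_name is none else (custom_schema_name | trim) -%}
    {%- if target.name == 'dev' -%}
        {#- Prefix dev schemas with the developer's name -#}
        {{ env_var('DBT_USER', 'shared') }}_{{ base_schema }}
    {%- else -%}
        {{ base_schema }}
    {%- endif -%}
{%- endmacro %}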

With this macro, a developer named alice who sets DBT_USER=alice will write to schemas like alice_staging and alice_marts. A developer named bob writes to bob_staging and bob_marts. They never collide.

In production, the schema names are unchanged — the DBT_USER prefix only applies in the dev target.
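
Since DBT_USER ends up inside a schema name, it helps if the value is a safe identifier. Usernames often contain hyphens, dots, or capitals, which unquoted warehouse identifiers generally do not accept. The following is a minimal sketch of normalizing the value before exporting it; the function name dbt_user_prefix is hypothetical, and the rule (lowercase letters, digits, and underscores, not starting with a digit) is an assumption that you should check against your warehouse.

```shell
# Hypothetical helper: turn a username into a value that is safe to use as a schema prefix.
dbt_user_prefix() {
  local raw="$1"
  local cleaned
  # Lowercase, replace each run of other characters with "_",
  # and prepend "_" if the result would start with a digit.
  cleaned="$(printf '%s' "$raw" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9_]+/_/g; s/^([0-9])/_\1/')"
  printf '%s\n' "$cleaned"
}

# Usage (not executed here):
#   export DBT_USER="$(dbt_user_prefix "$USER")"
# dbt_user_prefix 'Alice-Smith' prints alice_smith
```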


Environment-Aware Model Logic

Models often need to behave differently based on which environment they are running in. The target.name variable gives you this:

Limiting data volume in dev:

-- models/core/fct_events.sql
select
    event_id,
    user_id,
    event_type,
    occurred_at
from {{ ref('stg_events') }}
{% if target.name == 'dev' %}
where occurred_at >= current_date - 14
{% endif %}

In dev, only the last 14 days of events load. The query is fast and cheap. In staging and production, the full table processes.

Using different data volumes per environment:

{% if target.name == 'dev' %}
where occurred_at >= current_date - 7
{% elif target.name == 'staging' %}
where occurred_at >= current_date - 90
{% endif %}
-- prod: no where clause, full data

Staging gets three months of data — enough to validate business logic without the expense of full production data.


Deferral: Avoiding Full Rebuilds in CI

When you open a pull request, you usually only change one or two models. Running your full dbt project in CI to validate those two models wastes time and money.

dbt’s defer feature solves this by letting you reference existing artifacts from a previous production run for unchanged models, instead of rebuilding them from scratch.

In dbt Core, you use the --defer and --state flags:

Terminal window
# Step 1: Download production artifacts (manifest.json) from the last successful prod run
# Step 2: Run only changed models, deferring everything else to prod
dbt run --defer --state ./prod_artifacts/ --select state:modified+

The state:modified+ selector picks the models that differ from the production manifest, plus everything downstream of them (the trailing +). --defer tells dbt to resolve references to unselected upstream models against the production versions instead of building them. A PR that changes fct_revenue might only need to rebuild fct_revenue and its downstream dependents, not the entire staging layer.
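
The two steps above depend on the manifest actually being present. The following is a minimal sketch of a guard that emits the deferral flags only when a production manifest was downloaded, so the job falls back to a full run otherwise. The function name defer_flags is hypothetical; the flags are the ones shown above.

```shell
# Hypothetical helper: print deferral flags only if a production manifest is available.
# Argument 1 is the directory that should contain manifest.json.
defer_flags() {
  local state_dir="$1"
  if [ -f "$state_dir/manifest.json" ]; then
    printf '%s\n' "--defer --state $state_dir --select state:modified+"
  else
    # No baseline to compare against: emit nothing so dbt runs the whole project.
    echo "no manifest in $state_dir, falling back to a full run" >&2
  fi
}

# Usage (not executed here). The substitution is left unquoted on purpose so the
# flags split into separate arguments; this assumes the path contains no spaces.
#   dbt run $(defer_flags ./prod_artifacts)
```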


dbt Cloud Environments

In dbt Cloud, the concept expands from targets to environments. Each environment in dbt Cloud carries its own settings, including the credentials it connects with and the branch it runs from.

The key distinction dbt Cloud adds is a designated Production environment. Marking an environment as Production tells dbt Cloud to use its compiled artifacts as the baseline for deferred runs in other environments. The Slim CI feature in dbt Cloud uses this automatically — when a PR is opened, it runs only modified models against the staging environment while deferring to the Production environment’s artifacts for everything else.

Setting up environments in dbt Cloud:

  1. Go to Deploy > Environments
  2. Create a “Production” environment pointed at your main branch with production credentials
  3. Create a “Staging” environment pointed at your staging branch with staging credentials
  4. Create jobs in each environment — the Prod job runs on a schedule, the Staging job runs on PR open

CI/CD Integration with GitHub Actions

For dbt Core users who prefer to keep the full CI pipeline in code:

.github/workflows/dbt_ci.yml
name: dbt CI
on:
  pull_request:
    branches: [main]
jobs:
  dbt_test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dbt
        run: pip install dbt-snowflake
      - name: Download production artifacts
        run: |
          mkdir -p prod_artifacts
          # Pull manifest.json from your artifact storage (S3, GCS, etc.)
          # Assumes AWS credentials are already configured for this job
          aws s3 cp s3://your-bucket/dbt-artifacts/manifest.json prod_artifacts/
      - name: Run dbt (modified models only)
        env:
          # Variable names must match the env_var() calls in the staging target of profiles.yml
          SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_USER_STAGING: ${{ secrets.SNOWFLAKE_USER_STAGING }}
          SNOWFLAKE_PASSWORD_STAGING: ${{ secrets.SNOWFLAKE_PASSWORD_STAGING }}
          DBT_USER: ${{ github.actor }}
        run: |
          dbt deps
          dbt build --target staging --defer --state ./prod_artifacts/ --select state:modified+

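The github.actor expression sets DBT_USER to the GitHub username of whoever triggered the workflow run, which is usually the PR author. Note that the generate_schema_name macro shown earlier applies the DBT_USER prefix only when target.name is 'dev', so with --target staging this job writes to the shared staging schemas. To give each pull request its own schemas in staging, extend the macro's condition to cover the staging target as well.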
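
The workflow downloads manifest.json, which means something has to upload it after each successful production run. The following is a minimal sketch of that counterpart step. The function name publish_manifest is hypothetical, the bucket path reuses the placeholder from the workflow, and it assumes the AWS CLI is installed and authenticated. dbt writes manifest.json into its target/ directory by default.

```shell
# Hypothetical helper: copy the manifest from a finished production run to artifact storage.
# Argument 1: dbt output directory (default: target). Argument 2: destination prefix.
publish_manifest() {
  local target_dir="${1:-target}"
  local dest_prefix="$2"
  if [ ! -f "$target_dir/manifest.json" ]; then
    echo "no manifest.json in $target_dir; did the dbt run finish?" >&2
    return 1
  fi
  aws s3 cp "$target_dir/manifest.json" "$dest_prefix/manifest.json"
}

# Usage (not executed here), chained so the upload happens only after a successful run:
#   dbt build --target prod && publish_manifest target s3://your-bucket/dbt-artifacts
```

Uploading only on success keeps the baseline in storage aligned with what is actually deployed in production.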


Environment Variables for Credentials

Credentials should never be hardcoded in profiles.yml or committed to git. Use environment variables:

# profiles.yml (safe to commit — no actual credentials)
my_project:
  target: dev
  outputs:
    # dev output omitted here for brevity; the default target must exist under outputs
    prod:
      type: bigquery
      method: service-account
      project: "{{ env_var('GCP_PROJECT') }}"
      dataset: "{{ env_var('BQ_DATASET') }}"
      keyfile: "{{ env_var('GOOGLE_APPLICATION_CREDENTIALS') }}"
      threads: 4

Set the actual values in your CI/CD secrets manager (GitHub Actions secrets, AWS Secrets Manager, HashiCorp Vault) and pass them as environment variables at runtime. This keeps credentials out of source code entirely.
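
A missing variable otherwise surfaces only when dbt tries to render profiles.yml, so a preflight check can fail the job earlier with a clearer message. The following is a minimal bash sketch; the function name require_env is hypothetical, and the variable names in the usage line are the ones from the profile above.

```shell
# Hypothetical helper: verify that every named environment variable is set and non-empty.
# Returns 0 if all are present, 1 otherwise, listing each missing name on stderr.
require_env() {
  local missing=0
  local name
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable whose name is in $name.
    if [ -z "${!name:-}" ]; then
      echo "missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (not executed here):
#   require_env GCP_PROJECT BQ_DATASET GOOGLE_APPLICATION_CREDENTIALS && dbt run --target prod
```

The check reports only variable names, never their values, so it is safe to leave in CI logs.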


Common Environment Management Mistakes

Using the same schema for dev and prod — if your dev and prod targets both write to the same schema, development work will overwrite production tables. Always use distinct schemas per environment.

Hardcoding target.name == 'prod' checks in too many models — this makes models harder to read. A single macro that encapsulates environment behavior is cleaner than scattered conditionals.

No staging environment — going directly from dev to prod without validation means any logic errors will hit your production data before you catch them. Staging catches these before they matter.

Rebuilding everything in CI — CI pipelines that run dbt build without state-based selection are slow and expensive. Use state:modified+ with deferred artifacts.

Forgetting to update credentials when environments change — if a warehouse account changes in production, update the prod target immediately. Using environment variables makes this rotation easier.


Summary

Environment management in dbt centers on three ideas: isolating schemas so development work cannot affect production, using targets or Cloud environments to point different runs at different warehouse configurations, and applying environment-aware logic in models so that dev stays fast and cheap while prod processes the complete data.

The promotion path from dev to staging to prod, with CI validation at the staging gate, is the standard pattern for teams that take data quality seriously. Deferred runs make this practical by avoiding the cost of rebuilding unchanged models on every pull request.