Scripts · Apr 16, 2026 · 3 min read

dbt — Data Build Tool for SQL Transformations

Open-source framework for modeling, testing, and documenting SQL transformations in the modern data warehouse.

Introduction

dbt (data build tool) turns SQL into a software-engineering discipline for the analytics layer of your warehouse. Analytics engineers write modular SELECT statements, and dbt compiles them into views, tables, and incremental models with dependency graphs, tests, documentation, and CI — without ever leaving SQL.

What dbt Does

  • Compiles Jinja-templated SQL into warehouse-native DDL/DML.
  • Manages DAGs of models with ref() so dependencies and scheduling are implicit.
  • Runs data tests (not_null, unique, accepted_values, custom) against models and sources.
  • Generates a lineage-aware docs site from the project and warehouse metadata.
  • Packages reusable macros and models for sharing via the dbt Hub.
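To make this concrete, here is a minimal dbt model — a hypothetical staging model over a `raw.orders` source (all names are illustrative, not from a real project):

```sql
-- models/stg_orders.sql (hypothetical example)
{{ config(materialized='view') }}

select
    order_id,
    customer_id,
    order_date
from {{ source('raw', 'orders') }}
where order_date is not null
```

Because the model uses `source()` (and downstream models would use `ref()`), dbt can resolve the fully qualified table name per environment and place this node correctly in the dependency graph.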

Architecture Overview

dbt-core parses a project of .sql models, schema.yml tests, and dbt_project.yml config into a manifest — a DAG of nodes with compiled SQL, tests, exposures, and sources. Adapters (Postgres, Snowflake, BigQuery, Redshift, Databricks, Spark, DuckDB, Trino, ClickHouse, etc.) translate manifest nodes to warehouse-specific materializations and run them in topological order.
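The run order an adapter follows can be sketched with a toy dependency graph — this is an illustration of topological ordering, not dbt's actual manifest format:

```python
from graphlib import TopologicalSorter

# Toy "manifest": each model maps to the models it ref()s.
# Model names are illustrative, not a real dbt project.
manifest = {
    "stg_orders": set(),                       # reads from a source
    "stg_customers": set(),
    "fct_orders": {"stg_orders", "stg_customers"},
    "rpt_revenue": {"fct_orders"},
}

# dbt runs nodes in topological order, so every ref() is built
# before anything that depends on it.
order = list(TopologicalSorter(manifest).static_order())
print(order)
```

Staging models come first, then the fact model, then the report — the same guarantee dbt gives regardless of how many nodes the project contains.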

Self-Hosting & Configuration

  • Install just the adapter you need: dbt-snowflake, dbt-bigquery, dbt-postgres, etc.
  • Keep credentials outside the repo in ~/.dbt/profiles.yml or environment variables.
  • Use packages.yml to pull in community packages like dbt_utils, dbt_expectations.
  • Manage environments with target profiles: dev, ci, prod, each with its own schema.
  • Deploy with dbt Cloud, Airflow, Dagster, Prefect, or GitHub Actions calling dbt build.
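A minimal profiles.yml illustrating the pattern — a Postgres target with credentials pulled from environment variables (host, schema, and project names are placeholders):

```yaml
# ~/.dbt/profiles.yml — illustrative; values are placeholders
my_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: analytics
      schema: dbt_dev
      threads: 4
```

Adding `ci` and `prod` entries under `outputs`, each with its own schema, is how the dev/ci/prod separation above is typically wired up.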

Key Features

  • ref() and source() macros for dependency-safe, environment-aware references.
  • Incremental models, snapshots (SCD type 2), seeds, and analyses as first-class node types.
  • Built-in tests + dbt-expectations for data quality.
  • Jinja + Python (dbt-core 1.3+) models for lightweight warehouse-native ML and feature engineering.
  • Git-friendly: every model, test, and doc lives in version-controlled SQL/YAML.
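A sketch of an incremental model, one of the materializations listed above — the model and column names are hypothetical:

```sql
-- models/fct_events.sql — hypothetical incremental model
{{ config(materialized='incremental', unique_key='event_id') }}

select
    event_id,
    user_id,
    event_type,
    occurred_at
from {{ ref('stg_events') }}

{% if is_incremental() %}
  -- On incremental runs, only process rows newer than
  -- what already exists in the target table ({{ this }}).
  where occurred_at > (select max(occurred_at) from {{ this }})
{% endif %}
```

On the first run dbt builds the full table; on later runs the `is_incremental()` branch limits the scan to new rows and merges them in on `event_id`.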

Comparison with Similar Tools

  • SQLMesh — Newer; adds virtual environments and stronger diffing; dbt has the larger ecosystem.
  • Dataform (Google) — Cloud-first and warehouse-specific; dbt is open and multi-warehouse.
  • Apache Airflow — Orchestrator; dbt is the transformation layer that Airflow often runs.
  • Matillion / Fivetran Transformations — UI-driven; dbt is code-first with full Git workflow.
  • Talend — Classic ETL; dbt is ELT-in-warehouse with much lighter infrastructure.

FAQ

Q: Is dbt an orchestrator? A: No — dbt runs DAGs of SQL models. Schedule it with Airflow, Dagster, Prefect, or dbt Cloud.

Q: What warehouses are supported? A: Snowflake, BigQuery, Redshift, Databricks, Postgres, DuckDB, ClickHouse, Trino, Spark, and many more via community adapters.

Q: How do tests work? A: Declarative assertions in YAML compile to SELECT statements; dbt test fails when any row is returned.
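For example, a schema.yml like the following (model and column names are illustrative) declares tests that dbt compiles into SELECTs returning any violating rows:

```yaml
# models/schema.yml — hypothetical declarative tests
version: 2
models:
  - name: stg_orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']
```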

Q: Can I use Python? A: Yes — Python models are supported on Snowflake, Databricks, and BigQuery adapters starting in dbt-core 1.3.
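A sketch of what such a Python model looks like — it runs inside the warehouse via the adapter, not standalone, and the model name, upstream ref, and DataFrame calls are illustrative (the exact DataFrame API depends on the adapter, e.g. pandas vs. Snowpark):

```python
# models/customer_order_counts.py — hypothetical dbt Python model (dbt-core 1.3+)
def model(dbt, session):
    dbt.config(materialized="table")

    # dbt.ref() returns a DataFrame-like object for the upstream model
    orders = dbt.ref("stg_orders")

    # Per-customer aggregate; pandas-style syntax shown for illustration
    features = orders.groupby("customer_id").agg({"order_id": "count"})
    return features
```

The returned DataFrame is materialized by dbt just like a SQL model, so downstream SQL models can `ref()` it transparently.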
