# dbt — Data Build Tool for SQL Transformations

> Open-source framework for modeling, testing, and documenting SQL transformations in the modern data warehouse.

## Quick Use

```bash
# Install dbt-core and an adapter (e.g., Snowflake, BigQuery, Postgres)
pip install dbt-postgres

# Scaffold a project
dbt init analytics && cd analytics

# Configure ~/.dbt/profiles.yml with your warehouse credentials, then:
dbt deps                             # install packages
dbt seed                             # load reference CSVs
dbt run                              # materialize models
dbt test                             # run schema + data tests
dbt docs generate && dbt docs serve  # browse lineage
```

## Introduction

dbt (data build tool) turns SQL into a software-engineering discipline for the analytics layer of your warehouse. Analytics engineers write modular `SELECT` statements, and dbt compiles them into views, tables, and incremental models with dependency graphs, tests, documentation, and CI — without ever leaving SQL.

## What dbt Does

- Compiles Jinja-templated SQL into warehouse-native DDL/DML.
- Manages DAGs of models with `ref()` so dependencies and execution order are implicit.
- Runs data tests (`not_null`, `unique`, `accepted_values`, custom) against your models.
- Generates a lineage-aware docs site from project and warehouse metadata.
- Packages reusable macros and models for sharing via the dbt Hub.

## Architecture Overview

`dbt-core` parses a project of `.sql` models, `schema.yml` tests, and `dbt_project.yml` config into a manifest — a DAG of nodes with compiled SQL, tests, exposures, and sources. Adapters (Postgres, Snowflake, BigQuery, Redshift, Databricks, Spark, DuckDB, Trino, ClickHouse, etc.) translate manifest nodes into warehouse-specific materializations and run them in topological order.

## Self-Hosting & Configuration

- Install just the adapter you need: `dbt-snowflake`, `dbt-bigquery`, `dbt-postgres`, etc.
- Keep credentials outside the repo in `~/.dbt/profiles.yml` or environment variables.
- Use `packages.yml` to pull in community packages such as `dbt_utils` and `dbt_expectations`.
- Manage environments with target profiles (`dev`, `ci`, `prod`), each with its own schema.
- Deploy with dbt Cloud, Airflow, Dagster, Prefect, or GitHub Actions calling `dbt build`.

## Key Features

- `ref()` and `source()` macros for dependency-safe, environment-aware references.
- Incremental models, snapshots (SCD type 2), seeds, and analyses as first-class node types.
- Built-in tests plus `dbt-expectations` for data quality.
- Jinja templating, plus Python models (dbt-core 1.3+) for lightweight warehouse-native ML and feature engineering.
- Git-friendly: every model, test, and doc lives in version-controlled SQL/YAML.

## Comparison with Similar Tools

- **SQLMesh** — newer; adds virtual environments and stronger diffing; dbt has the larger ecosystem.
- **Dataform** (Google) — cloud-first and warehouse-specific; dbt is open source and multi-warehouse.
- **Apache Airflow** — an orchestrator; dbt is the transformation layer that Airflow often runs.
- **Matillion / Fivetran Transformations** — UI-driven; dbt is code-first with a full Git workflow.
- **Talend** — classic ETL; dbt is ELT-in-warehouse with much lighter infrastructure.

## FAQ

**Q:** Is dbt an orchestrator?
**A:** No — dbt runs DAGs of SQL models. Schedule it with Airflow, Dagster, Prefect, or dbt Cloud.

**Q:** What warehouses are supported?
**A:** Snowflake, BigQuery, Redshift, Databricks, Postgres, DuckDB, ClickHouse, Trino, Spark, and many more via community adapters.

**Q:** How do tests work?
**A:** Declarative assertions in YAML compile to `SELECT` statements; `dbt test` fails when any rows are returned.

**Q:** Can I use Python?
**A:** Yes — Python models are supported on the Snowflake, Databricks, and BigQuery adapters starting in dbt-core 1.3.
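The test mechanics described in the FAQ can be made concrete with a minimal `schema.yml` sketch; the model and column names here (`orders`, `order_id`, `status`) are illustrative, not from any particular project:

```yaml
# models/schema.yml: declarative tests that dbt compiles to SELECT statements
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique      # fails if any order_id appears more than once
          - not_null    # fails if any order_id is NULL
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']
```

`dbt test` runs each generated query against the warehouse; a test passes only when the query returns zero rows.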
## Sources

- https://github.com/dbt-labs/dbt-core
- https://docs.getdbt.com