# dbt — Data Build Tool for SQL Transformations

> Open-source framework for modeling, testing, and documenting SQL transformations in the modern data warehouse.

## Quick Use

```bash
# Install dbt-core and an adapter (e.g., Snowflake, BigQuery, Postgres)
pip install dbt-postgres

# Scaffold a project
dbt init analytics && cd analytics

# Configure ~/.dbt/profiles.yml with your warehouse credentials, then:
dbt deps                             # install packages
dbt seed                             # load reference CSVs
dbt run                              # materialize models
dbt test                             # run schema + data tests
dbt docs generate && dbt docs serve  # browse lineage
```

## Introduction

dbt (data build tool) turns SQL into a software-engineering discipline for the analytics layer of your warehouse. Analytics engineers write modular `SELECT` statements, and dbt compiles them into views, tables, and incremental models with dependency graphs, tests, documentation, and CI — without ever leaving SQL.

## What dbt Does

- Compiles Jinja-templated SQL into warehouse-native DDL/DML.
- Manages DAGs of models with `ref()` so dependencies and execution order are implicit.
- Runs data tests (`not_null`, `unique`, `accepted_values`, custom) against your models.
- Generates a lineage-aware docs site from project and warehouse metadata.
- Packages reusable macros and models for sharing via the dbt Hub.

## Architecture Overview

`dbt-core` parses a project of `.sql` models, `schema.yml` tests, and `dbt_project.yml` config into a manifest — a DAG of nodes with compiled SQL, tests, exposures, and sources. Adapters (Postgres, Snowflake, BigQuery, Redshift, Databricks, Spark, DuckDB, Trino, ClickHouse, etc.) translate manifest nodes into warehouse-specific materializations and run them in topological order.

## Self-Hosting & Configuration

- Install just the adapter you need: `dbt-snowflake`, `dbt-bigquery`, `dbt-postgres`, etc.
- Keep credentials outside the repo in `~/.dbt/profiles.yml` or environment variables.
- Use `packages.yml` to pull in community packages such as `dbt_utils` and `dbt_expectations`.
- Manage environments with target profiles (`dev`, `ci`, `prod`), each with its own schema.
- Deploy with dbt Cloud, Airflow, Dagster, Prefect, or GitHub Actions calling `dbt build`.

## Key Features

- `ref()` and `source()` macros for dependency-safe, environment-aware references.
- Incremental models, snapshots (SCD type 2), seeds, and analyses as first-class node types.
- Built-in tests plus `dbt-expectations` for data quality.
- Jinja templating, plus Python models (dbt-core 1.3+) for lightweight warehouse-native ML and feature engineering.
- Git-friendly: every model, test, and doc lives in version-controlled SQL/YAML.

## Comparison with Similar Tools

- **SQLMesh** — newer; adds virtual environments and stronger diffing; dbt has the larger ecosystem.
- **Dataform** (Google) — cloud-first and warehouse-specific; dbt is open source and multi-warehouse.
- **Apache Airflow** — an orchestrator; dbt is the transformation layer that Airflow often runs.
- **Matillion / Fivetran Transformations** — UI-driven; dbt is code-first with a full Git workflow.
- **Talend** — classic ETL; dbt is ELT-in-warehouse with much lighter infrastructure.

## FAQ

**Q:** Is dbt an orchestrator?
**A:** No — dbt runs DAGs of SQL models. Schedule it with Airflow, Dagster, Prefect, or dbt Cloud.

**Q:** What warehouses are supported?
**A:** Snowflake, BigQuery, Redshift, Databricks, Postgres, DuckDB, ClickHouse, Trino, Spark, and many more via community adapters.

**Q:** How do tests work?
**A:** Declarative assertions in YAML compile to `SELECT` statements; `dbt test` fails when any rows are returned.

**Q:** Can I use Python?
**A:** Yes — Python models are supported on the Snowflake, Databricks, and BigQuery adapters starting in dbt-core 1.3.
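The test mechanics described in the FAQ can be made concrete with a minimal `schema.yml` sketch; the model and column names here (`orders`, `order_id`, `status`) are illustrative, not from any particular project:

```yaml
# models/schema.yml: declarative tests that dbt compiles to SELECT statements
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique      # fails if any order_id appears more than once
          - not_null    # fails if any order_id is NULL
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']
```

`dbt test` runs each generated query against the warehouse; a test passes only when the query returns zero rows.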
## Sources

- https://github.com/dbt-labs/dbt-core
- https://docs.getdbt.com