# dbt-unit-tests: dbt Unit Test Generation

## Requirements

- Agent: builder or migrator (requires file write access)
- Tools used: `dbt_unit_test_gen`, `dbt_manifest`, `dbt_lineage`, `altimate_core_validate`, `altimate_core_testgen`, `bash` (runs `altimate-dbt` commands), `read`, `glob`, `write`, `edit`
## When to Use This Skill

Use when the user wants to:

- Generate unit tests for a dbt model
- Add test coverage to an existing model
- Create mock data for testing
- Practice test-driven development (TDD) for dbt
- Verify CASE/WHEN logic, NULL handling, JOIN behavior, or aggregation correctness
- Test incremental model logic

Do NOT use for:

- Adding schema tests (not_null, unique, accepted_values) -> use dbt-test
- Creating or modifying model SQL -> use dbt-develop
- Writing descriptions -> use dbt-docs
- Debugging build failures -> use dbt-troubleshoot
## The Iron Rules

- Never guess expected outputs. Compute them by running SQL against mock data when possible. If you cannot run SQL, clearly mark expected outputs as placeholders that need verification.
- Never skip upstream dependencies. Every ref() and source() the model touches MUST have a mock input. Miss one and the test won't compile.
- Use sql format for ephemeral models. Dict format fails silently for ephemeral upstreams.
- Never weaken a test to make it pass. If the test fails, the model logic may be wrong. Investigate before changing expected values.
- Compile before committing. Always run `altimate-dbt test --model <name>` to verify tests compile and execute.
## Core Workflow: Analyze -> Generate -> Refine -> Validate -> Write
## Phase 1: Analyze the Model

Before generating any tests, deeply understand the model:

```
# 1. Ensure manifest is compiled
altimate-dbt compile --model <name>

# 2. Read the model SQL
read <model_sql_file>

# 3. Parse the manifest for dependencies
dbt_unit_test_gen(manifest_path: "target/manifest.json", model: "<name>")
```
What to look for:

- Which upstream refs/sources does this model depend on?
- What SQL constructs need testing? (CASE/WHEN, JOINs, window functions, aggregations)
- What edge cases exist? (NULLs, empty strings, zero values, boundary dates)
- Is this an incremental model? (needs `is_incremental` override tests)
- Are any upstream models ephemeral? (need sql format)
## Phase 2: Generate Tests

The dbt_unit_test_gen tool does the heavy lifting:

```
dbt_unit_test_gen(
  manifest_path: "target/manifest.json",
  model: "fct_orders",
  max_scenarios: 5
)
```
This returns:
- Complete YAML with mock inputs and expected outputs
- Semantic context: model/column descriptions, column lineage, compiled SQL
- List of anti-patterns that informed edge case generation
- Warnings about ephemeral deps, missing columns, etc.
If the tool reports missing columns (placeholder rows in the YAML), discover them:

```
altimate-dbt columns --model <upstream_model_name>
altimate-dbt columns-source --source <source_name> --table <table_name>
```

Then update the generated YAML with real column names.
## Phase 3: Refine Expected Outputs

This is the critical step that differentiates good tests from bad ones. The tool generates placeholder expected outputs based on column types. You MUST refine them.

**Option A: Compute by running SQL (preferred)**

```
# Run the model against mock data to get actual output
altimate-dbt test --model <name>

# If the test fails, the error shows actual vs expected — use actual as expected
```
**Option B: Manual computation**

Read the model SQL carefully and mentally execute it against the mock inputs. For each test case:

- Look at the mock input rows
- Trace through the SQL logic (CASE/WHEN branches, JOINs, aggregations)
- Write the correct expected output
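When mental execution gets error-prone, the same tracing can be done mechanically. The sketch below (hypothetical model logic and column names, not from any real project) loads mock rows into Python's in-process sqlite3 and runs model-like SQL against them, so the expected output is computed rather than guessed:

```python
import sqlite3

# Hypothetical mock input rows for a stg_orders upstream (illustrative only).
stg_orders = [
    (1, "completed", 100.0),
    (2, "returned", 50.0),
    (3, None, 0.0),  # NULL status exercises the COALESCE / ELSE paths
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (order_id INTEGER, status TEXT, amount REAL)")
conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", stg_orders)

# Model-like SQL: a CASE/WHEN plus NULL handling -- exactly the logic the
# unit test's `expect` block has to reproduce correctly.
rows = conn.execute("""
    SELECT order_id,
           CASE WHEN status = 'completed' THEN amount ELSE 0 END AS net_amount,
           COALESCE(status, 'unknown') AS status_clean
    FROM stg_orders
    ORDER BY order_id
""").fetchall()

for row in rows:
    print(row)
```

Note that for row 3 the NULL status makes the CASE comparison NULL, so the ELSE branch fires. Dialect differences (SQLite vs your warehouse) can matter for types and NULL ordering, so treat this as a scratchpad, not a substitute for Option A or C.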
**Option C: Use the warehouse (most accurate)**

```
# Build a CTE query with mock data and run the model SQL against it
altimate-dbt execute --query "WITH mock_stg_orders AS (SELECT 1 AS order_id, 100.00 AS amount) SELECT * FROM (<model_sql>) sub"
```
## Phase 4: Validate

```
# 1. Run the unit tests
altimate-dbt test --model <name>

# 2. If tests fail, read the error carefully
#    - Compilation error? Missing ref, wrong column name, type mismatch
#    - Assertion error? Expected output doesn't match actual

# 3. Fix and retry (max 3 iterations)
```
## Phase 5: Write to File

Place unit tests in one of these locations (match project convention):

- `models/<layer>/_unit_tests.yml` (dedicated file)
- `models/<layer>/schema.yml` (append to existing)

```
# Check existing convention
glob models/**/*unit_test*.yml models/**/*schema*.yml

# Write or append
edit <yaml_file>   # if file exists
write <yaml_file>  # if creating new
```
## Test Case Categories

### Happy Path (always generate)

Standard inputs that exercise the main logic path. Two rows minimum.
### NULL Handling

Set nullable columns to NULL in the last row. Verify COALESCE/NVL/IFNULL behavior.
### Boundary Values

Zero amounts, empty strings, epoch dates, MAX values. Tests robustness.
### Edge Cases
- Division by zero (if model divides)
- Non-matching JOINs (LEFT JOIN with no match)
- Single-row aggregation
- Duplicate key handling
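Two of these edge cases trip people up most often: a LEFT JOIN with no match produces NULLs on the right side, and an unguarded division blows up on zero. A minimal sketch (hypothetical tables and columns) showing both behaviors, with NULLIF as the divide-by-zero guard:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL, item_count INTEGER);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    -- customer_id 99 has no matching customer row
    INSERT INTO orders VALUES (1, 10, 100.0, 4), (2, 99, 50.0, 0);
    INSERT INTO customers VALUES (10, 'Ada');
""")

rows = conn.execute("""
    SELECT o.order_id,
           c.name,                                               -- NULL when the LEFT JOIN misses
           o.amount / NULLIF(o.item_count, 0) AS avg_item_price  -- NULL instead of div-by-zero
    FROM orders o
    LEFT JOIN customers c ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```

A good edge-case test asserts exactly these NULLs in its `expect` rows instead of pretending the unmatched order does not exist.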
### Incremental

For incremental models only. Use `overrides.macros.is_incremental: true` to test the incremental path.
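As a sketch (hypothetical model and columns), an incremental test pairs the `is_incremental` override with `input: this`, dbt's way of mocking the already-built table. The `expect` block here assumes the model filters out rows already present in `this`:

```yaml
unit_tests:
  - name: test_fct_orders_incremental_run
    description: "Only new rows flow through on an incremental run"
    model: fct_orders            # hypothetical model name
    overrides:
      macros:
        is_incremental: true     # force the incremental branch
    given:
      - input: this              # mock the existing target table
        rows:
          - { order_id: 1, amount: 100.0 }
      - input: ref('stg_orders')
        rows:
          - { order_id: 1, amount: 100.0 }
          - { order_id: 2, amount: 50.0 }
    expect:
      rows:
        - { order_id: 2, amount: 50.0 }
```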
## Common Mistakes
| Mistake | Fix |
|---|---|
| Missing a ref() in given | Parse manifest for ALL depends_on nodes |
| Wrong column names in mock data | Use manifest columns, not guesses |
| Wrong data types | Use schema catalog types |
| Expected output is just mock input | Actually compute the transformation |
| Dict format for ephemeral model | Use format: sql with raw SQL |
| Not testing NULL path in COALESCE | Add null_handling test case |
| Hardcoded dates with current_timestamp | Use overrides.macros to mock timestamps |
| Testing trivial pass-through | Skip models with no logic |
## YAML Format Reference

```yaml
unit_tests:
  - name: test_<model>_<scenario>
    description: "What this test verifies"
    model: <model_name>
    overrides: # optional
      macros:
        is_incremental: true # for incremental models
      vars:
        run_date: "2024-01-15" # for date-dependent logic
    given:
      - input: ref('upstream_model')
        rows:
          - { col1: value1, col2: value2 }
      - input: source('source_name', 'table_name')
        rows:
          - { col1: value1 }
      - input: ref('ephemeral_model')
        format: sql
        rows: |
          SELECT 1 AS id, 'test' AS name
          UNION ALL
          SELECT 2 AS id, 'other' AS name
    expect:
      rows:
        - { output_col1: expected1, output_col2: expected2 }
```
## Reference Guides
| Guide | Use When |
|---|---|
| references/unit-test-yaml-spec.md | Full YAML specification and format details |
| references/edge-case-patterns.md | Catalog of edge cases by SQL construct |
| references/incremental-testing.md | Testing incremental models |
| references/altimate-dbt-commands.md | Full CLI reference |