Test-Driven Development

A discipline for growing software guided by tests, one small step at a time.

When to Activate

Use this skill when:

Writing code test-first for a new feature or module
Starting a greenfield project and need a walking skeleton
Applying outside-in development from acceptance tests to unit tests
Sequencing tests to drive incremental design
Creating a test list to plan implementation
Reviewing whether tests are giving good design feedback
Deciding between Fake It, Obvious Implementation, or Triangulation

The TDD Cycle

Five Steps

Write a test list — brainstorm the tests you think you'll need
Write one failing test — pick the next test from the list
Make it pass — write the simplest code that makes the test green
Refactor — remove duplication and improve design, keeping tests green
Repeat — pick the next test, update the list as you learn

Red-Green-Refactor

The micro-cycle is a three-state machine:

    ┌──────────────────────────────────────┐
    │                                      │
    ▼                                      │
  [RED] ──── write minimal code ────► [GREEN] ──── refactor ────► [GREEN]
    ▲                                                                │
    │                                                                │
    └──────────── write next failing test ◄──────────────────────────┘

RED: Write a test that fails. Confirm it fails for the right reason — the missing behavior, not a syntax error or wrong import.

GREEN: Make the test pass with the simplest change possible. "Sinful" code is fine — hardcoded values, copy-paste, whatever gets green fastest.

REFACTOR: Clean up while all tests stay green. Remove duplication between test and production code. Improve names. Extract methods. This is where design emerges.

Rules:

Never write production code without a failing test
Never write more test code than needed to fail
Never write more production code than needed to pass

The Three Laws

Uncle Bob's formalization of Beck's constraints:

You shall not write production code unless you have a failing test
You shall not write more of a test than is sufficient to fail (including compile failures)
You shall not write more production code than is sufficient to pass the currently failing test

These laws enforce the micro-cycle. They keep steps small and feedback immediate.

Test List

A test list is a brainstorm of all the tests you think you'll need, written before you start coding. It is your roadmap.

How to Create One

Think about the behavior you need to implement
List the specific cases: happy path, edge cases, error cases
Order them from simplest to most complex
Start with a degenerate case if possible

Example: Stack

Test List:
- [ ] new stack is empty
- [ ] push one item, stack is not empty
- [ ] push one item then pop, returns that item
- [ ] push two items then pop, returns second item
- [ ] pop empty stack raises error
- [ ] push and pop multiple items (LIFO order)
- [ ] peek returns top without removing

When to Update

The test list is alive. As you work:

Cross off completed tests
Add new tests you discover along the way
Remove tests that turn out to be redundant
Reorder if a different sequence makes more sense

A test list can seed beads tasks — create one task per test with skill:tdd labels for tracking.

Strategies for Getting to Green

Three strategies for making a failing test pass, in order of safety:

Fake It

Return a hardcoded value that makes the test pass. Then write the next test to force generalization.

# Test: add(1, 2) returns 3
# Fake It:
def add(a, b):
    return 3

# Next test forces real implementation:
# Test: add(3, 4) returns 7

When to use: When you're unsure how to implement the real thing. When the step to real code feels too big. When you want maximum safety.

Obvious Implementation

Type the real implementation directly, because it's clear what the code should be.

# Test: add(1, 2) returns 3
# Obvious Implementation:
def add(a, b):
    return a + b

When to use: When the implementation is trivially obvious. When you're confident. If you get an unexpected failure, fall back to Fake It.

Triangulation

Use two or more test cases to force removal of hardcoded values and drive toward the general solution.

# Test 1: add(1, 2) returns 3
def add(a, b):
    return 3           # fake it

# Test 2: add(3, 4) returns 7
def add(a, b):
    return a + b       # now forced to generalize

When to use: When you're uncertain about the abstraction. When two examples make the pattern clearer than one. When you need confidence before generalizing.

Test Sequencing

"As tests get more specific, code gets more generic." — Robert C. Martin

Start with degenerate cases and progress toward forcing generalization:

Degenerate/boundary cases — null, empty, zero, one element
Simple happy path — the most basic successful case
Variations — different inputs that exercise the same path
Edge cases — boundaries, maximums, special values
Error cases — invalid input, missing data, failure conditions

Each new test should require a small, incremental change to the production code. If a test requires a large change, you skipped a step — find a simpler test to write first.

Sequencing Example: FizzBuzz

Test List (ordered):
1. returns "1" for 1          → hardcode "1"
2. returns "2" for 2          → return string of number
3. returns "Fizz" for 3       → add modulo-3 check
4. returns "Fizz" for 6       → confirms generalization
5. returns "Buzz" for 5       → add modulo-5 check
6. returns "Buzz" for 10      → confirms generalization
7. returns "FizzBuzz" for 15  → add modulo-15 check
8. returns "FizzBuzz" for 30  → confirms generalization

Double-Loop TDD

Freeman & Pryce's model from Growing Object-Oriented Software, Guided by Tests:

  Outer Loop (Acceptance Test)
  ┌──────────────────────────────────────────────────┐
  │                                                  │
  │  Write failing          Acceptance test passes   │
  │  acceptance test ──────────────────► Done         │
  │       │                      ▲                   │
  │       ▼                      │                   │
  │   Inner Loop (Unit Tests)    │                   │
  │   ┌────────────────────┐     │                   │
  │   │ RED → GREEN →      │     │                   │
  │   │ REFACTOR → repeat  │─────┘                   │
  │   └────────────────────┘                         │
  │                                                  │
  └──────────────────────────────────────────────────┘

Outer loop: Write a failing end-to-end acceptance test that describes the feature from the user's perspective. This test stays red while you build the internals.

Inner loop: Use the standard Red-Green-Refactor cycle to implement the components needed to make the acceptance test pass.

Walking Skeleton

Start with the thinnest possible slice that exercises the full architecture:

Write a failing acceptance test for the simplest end-to-end scenario
Build just enough of each layer to make it pass — UI, service, persistence
Deploy the skeleton to a production-like environment
Every subsequent feature builds on this proven foundation

The walking skeleton proves your architecture works before you invest in features. It's the first acceptance test in the outer loop.

Outside-In Development

Start at the system boundary and work inward, discovering collaborators through tests.

The Process

Start at the boundary — write a test for the entry point (HTTP handler, CLI command, message consumer)
Discover collaborators — when the boundary object needs help, define an interface for the collaborator
Test the collaborator — drop down one level and TDD the collaborator
Repeat inward — each layer discovers the next through its tests
Integrate — wire real implementations together; the acceptance test goes green

Tell, Don't Ask

Prefer commands over queries. Tell objects what to do rather than asking for data and acting on it:

# Ask (fragile — coupled to internal structure):
if order.status == "paid" and order.items_in_stock():
    warehouse.ship(order.items)

# Tell (robust — delegates to the object that knows):
order.fulfill(warehouse)

Ports and Adapters

Separate domain logic from infrastructure for testability:

          ┌─────────────────────────┐
 HTTP ───►│ Adapter                 │
          │   └► Port (interface)   │
          │        └► Domain Logic  │
          │   ┌► Port (interface)   │
          │ Adapter                 │◄─── Database
          └─────────────────────────┘

Ports: Interfaces defined by the domain (what it needs)
Adapters: Implementations that connect to infrastructure (how it's provided)
Tests can substitute adapters with test doubles, keeping tests fast and isolated

Listen to the Tests

When tests are hard to write, they're telling you something about your design. Difficulty in testing is a symptom of a design problem.

Difficulty Signal	Probable Design Issue	Suggested Refactoring
Test needs many objects to set up	Class has too many dependencies	Extract class, introduce facade
Test setup is deeply nested	Object graph is too complex	Flatten hierarchy, use composition
Hard to name the test	Method does too many things	Extract method, single responsibility
Test needs to access private state	Public interface is insufficient	Improve public API, add query method
Many tests break for one change	High coupling between classes	Introduce interface, dependency inversion
Slow tests (not integration)	Hidden I/O or expensive operations	Extract port/adapter, inject dependency
Test requires complex mocking	Violation of Law of Demeter	Wrap and delegate, tell don't ask
Test duplicates production logic	Missing abstraction	Extract shared concept
Can't test in isolation	Static calls, global state, `new`	Inject dependencies, use factory

Craftsmanship Principles

From the Software Craftsmanship Manifesto, applied to TDD:

Well-crafted software — TDD produces code that works, is clean, and communicates intent
Steadily adding value — each green test is a verified increment of working software
A community of professionals — TDD is a shared discipline, not a personal preference
Productive partnerships — tests document behavior for the team; they're a communication tool

TDD is a professional practice. Not every line of code requires it, but when you practice it, practice it with discipline. Half-hearted TDD (writing tests but skipping refactoring, or testing after the fact) delivers little of the benefit.

Language Testing Skills

TDD teaches when and why to write tests. Language-specific skills teach how to use the testing framework. Load both when practicing TDD in a specific language.

TDD Concept	Elixir (`elixir-testing`)	Rust (`rust`)	Zig (`zig`)
Write a failing test	`test "name" do ... end`	`#[test] fn name()`	`test "name" = \|\| { ... }`
Assertions	`assert`, `assert_receive`	`assert!`, `assert_eq!`	`try expect(...)`
Test isolation	`async: true`, sandbox	Module-level isolation	Test allocator
Test doubles	Mox for behaviours	Trait-based injection	Comptime interfaces
Property tests	StreamData	proptest, quickcheck	N/A
Test organization	`describe` blocks, tags	Module hierarchy, `#[cfg(test)]`	Nested test blocks

Anti-Patterns

Test-After Development

Writing code first, tests second. You lose the design feedback loop — tests conform to the code rather than driving it. Tests become verification scripts, not design tools.

Writing Too Many Tests Before Going Green

Writing five tests at once, then trying to make them all pass. You lose the tight feedback cycle and can't triangulate. Write one test, make it pass, then write the next.

Skipping the Refactor Step

Going from green straight to the next test. Duplication accumulates. Design degrades. The codebase becomes harder to change, and TDD feels slower than it should. Refactoring is where TDD pays for itself.

Testing Implementation Details

Testing how code works rather than what it does. Brittle tests that break when you refactor internals. Test behavior through the public interface.

Gold-Plating (YAGNI)

Adding behavior not driven by a test. Adding tests for scenarios nobody asked for. If it's not on the test list and not an edge case you discovered, you ain't gonna need it.

Ice Cream Cone

Many end-to-end tests, few unit tests. Invert the pyramid — most tests should be fast unit tests. End-to-end tests verify integration, not logic.

References

For deeper theory and worked examples:

references/beck-tdd.md — Kent Beck's canonical TDD: Fake It, Triangulation, test isolation
references/goos-outside-in.md — Freeman & Pryce's GOOS: double-loop TDD, walking skeleton, outside-in design

Key Principles

Write the test first — always
Keep steps small — if it feels like a leap, find a smaller step
Red-Green-Refactor is a discipline, not a suggestion — never skip refactor
The test list is your roadmap — update it as you learn
Listen to the tests — difficulty testing signals design problems
Fake It when unsure, Obvious Implementation when confident
As tests get more specific, code gets more generic
Walking skeleton first — prove the architecture before building features
Outside-in discovers interfaces — let tests drive the design inward
TDD is about design, not just verification

tdd