TDD Workflow Skill

Overview

Test-Driven Development workflow enforcement. This skill ensures code is written test-first, following the Red-Green-Refactor cycle for higher quality and fewer bugs.

Type

technique

When to Invoke

Trigger keywords: TDD, test-driven, tests first, test coverage, red-green-refactor, write tests

Trigger phrases:

  • "write tests for..."
  • "test-driven development"
  • "implement with TDD"
  • "tests first approach"

Core Mandate

Write the test BEFORE the implementation.

This is not optional. TDD produces better designs because:

  • Forces thinking about interface before implementation
  • Catches edge cases early
  • Creates documentation through tests
  • Enables safe refactoring

The Red-Green-Refactor Cycle

┌─────────────────────────────────────────┐
│                                         │
│    ┌─────┐      ┌───────┐      ┌──────┐ │
│    │ RED │ ───► │ GREEN │ ───► │REFAC-│ │
│    │     │      │       │      │ TOR  │ │
│    └─────┘      └───────┘      └──────┘ │
│       ▲                           │     │
│       └───────────────────────────┘     │
│                                         │
└─────────────────────────────────────────┘

1. RED: Write a Failing Test

Before writing ANY code:

  1. Define expected behavior
  2. Write test that expresses this behavior
  3. Run the test - it MUST fail
  4. If the test passes, you wrote the wrong test

# Example: Testing a function that doesn't exist yet
def test_add_numbers_returns_sum():
    result = add_numbers(2, 3)  # Function doesn't exist!
    assert result == 5
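A quick way to confirm the RED state is genuine - the failure comes from the missing function, not a typo in the test - is to inspect the error directly. A minimal sketch outside any test runner, reusing the add_numbers example above:

```python
# add_numbers does not exist yet, so calling it raises NameError -
# the test is RED for the right reason.
failure = None
try:
    add_numbers(2, 3)
except NameError as exc:
    failure = str(exc)

assert failure is not None and "add_numbers" in failure
```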

2. GREEN: Write Minimal Code to Pass

Write ONLY enough code to make the test pass:

  • No extra features
  • No optimizations
  • No "while I'm here" additions
  • Ugly code is fine (for now)

# Minimal implementation - just enough to pass
def add_numbers(a, b):
    return a + b

3. REFACTOR: Clean Up While Tests Pass

Now improve the code:

  • Remove duplication
  • Improve naming
  • Extract methods/functions
  • Run tests after EVERY change
  • If tests fail, you broke something - revert
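Continuing the add_numbers example, a refactoring pass might extract validation into a helper while the original test keeps passing. A sketch (the _ensure_number helper is illustrative, not part of the skill):

```python
def _ensure_number(value):
    # Extracted during refactoring: input validation lives in one place.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"expected a number, got {type(value).__name__}")
    return value

def add_numbers(a, b):
    return _ensure_number(a) + _ensure_number(b)

# Re-run the suite after the change - GREEN must stay green:
assert add_numbers(2, 3) == 5
```

The behavior under test is unchanged; only the internal structure improved, which is the whole point of the refactor step.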

Test Pyramid

          ┌──────────────┐
         /   E2E Tests    \     ← Few, slow, fragile
        /   (10% of tests)  \
       ├────────────────────┤
      /  Integration Tests    \  ← Some, moderate speed
     /    (20% of tests)        \
    ├────────────────────────────┤
   /      Unit Tests               \ ← Many, fast, stable
  /       (70% of tests)             \
 └─────────────────────────────────────┘

Unit Tests (70%)

  • Test single function/method
  • No external dependencies (DB, network, filesystem)
  • Fast (< 100ms each)
  • Use mocks for dependencies
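A sketch of a test at this layer (slugify is a hypothetical pure function, so no mocks are needed at all):

```python
# Unit test: one pure function, no DB/network/filesystem,
# runs in well under 100ms.
def slugify(title):
    return "-".join(title.lower().split())

def test_slugify_joins_lowercased_words_with_hyphens():
    assert slugify("TDD Workflow Skill") == "tdd-workflow-skill"

test_slugify_joins_lowercased_words_with_hyphens()
```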

Integration Tests (20%)

  • Test component interactions
  • May use test databases
  • Moderate speed (< 5s each)
  • Verify real integrations work

E2E Tests (10%)

  • Test full user flows
  • Use real (or realistic) environment
  • Slow but comprehensive
  • Catch integration issues

Coverage Requirements

Coverage Type     Minimum    Target
Line coverage     70%        80%+
Branch coverage   60%        70%+
Critical paths    100%       100%
Edge cases        Explicit   Complete
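The line/branch distinction matters in practice. In this sketch (apply_discount is an invented example), a single test reaches 100% line coverage while exercising only half of the branch outcomes:

```python
def apply_discount(price, is_member):
    # One-armed if: every line runs even when the branch is not taken.
    if is_member:
        price = price * 0.9
    return price

# Executes every line (100% line coverage) but only the True
# outcome of the branch (50% branch coverage):
assert apply_discount(100, True) == 90.0
# The False case closes the branch-coverage gap:
assert apply_discount(100, False) == 100
```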

Critical paths requiring 100% coverage:

  • Authentication/authorization
  • Payment processing
  • Data validation
  • Security-sensitive code

TDD Iteration Tracking

Track each TDD cycle (similar to Ralph technique):

Cycle  Test Written              Implementation  Refactoring         Notes
1      Basic happy path          Minimal impl    None yet            Foundation
2      Edge case: empty input    Handle empty    Extract validation
3      Edge case: invalid type   Type checking   Clean up

Test Patterns

Arrange-Act-Assert (AAA)

def test_user_can_login():
    # Arrange: Set up test data
    user = create_test_user(email="test@example.com", password="secret")

    # Act: Perform the action
    result = login(email="test@example.com", password="secret")

    # Assert: Verify the outcome
    assert result.success is True
    assert result.user.email == "test@example.com"

Given-When-Then (BDD style)

def test_user_login():
    # Given a registered user
    user = create_test_user()

    # When they login with correct credentials
    result = login(user.email, user.password)

    # Then they should be authenticated
    assert result.is_authenticated()

Mocking Strategy

When to mock:

  • External services (APIs, databases)
  • Time-dependent code
  • Random/non-deterministic behavior
  • Slow operations

When NOT to mock:

  • The unit under test
  • Simple value objects
  • Pure functions

# Good: Mock external dependency
from unittest.mock import patch

@patch('myapp.services.external_api.fetch')
def test_fetches_data(mock_fetch):
    mock_fetch.return_value = {"data": "test"}
    result = process_external_data()
    assert result == "processed: test"

# Bad: Mocking the thing you're testing
# Don't do this - tests nothing!

Edge Cases Checklist

Always test:

  • Empty input
  • Null/None values
  • Boundary values (0, -1, MAX_INT)
  • Invalid types
  • Very large inputs
  • Unicode/special characters
  • Concurrent access (if applicable)
  • Network failures (if applicable)
  • Timeout scenarios
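Several checklist items can be swept in one table-driven test. A hypothetical example over a small normalisation function (the function and cases are invented for illustration):

```python
# Edge-case sweep: empty input, whitespace boundaries, unicode,
# very large input, and explicit rejection of None.
def normalize_name(name):
    if name is None:
        raise ValueError("name must not be None")
    return name.strip().casefold()

cases = {
    "": "",                      # empty input
    "  Ada  ": "ada",            # boundary: surrounding whitespace
    "ÉLODIE": "élodie",          # unicode / special characters
    "x" * 10_000: "x" * 10_000,  # very large input
}
for raw, expected in cases.items():
    assert normalize_name(raw) == expected

# None is rejected explicitly rather than silently coerced:
rejected = False
try:
    normalize_name(None)
except ValueError:
    rejected = True
assert rejected
```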

Anti-Patterns

Anti-Pattern             Problem                          Solution
Test after code          Doesn't drive design             Always test first
Testing implementation   Brittle tests                    Test behavior only
No assertions            Test passes but proves nothing   Always assert
Ignoring failing tests   Technical debt                   Fix or delete
Over-mocking             Tests pass, code broken          Mock boundaries only
Shared test state        Flaky tests                      Isolate each test
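The last row deserves a concrete sketch: the usual cure for shared test state is to build fresh state inside each test (make_cart is an invented fixture helper):

```python
# Each test constructs its own cart instead of mutating a
# module-level object, so test order cannot cause flakiness.
def make_cart():
    return {"items": []}

def test_add_item_starts_from_empty_cart():
    cart = make_cart()          # fresh state for this test only
    cart["items"].append("book")
    assert cart["items"] == ["book"]

def test_new_cart_has_no_items():
    cart = make_cart()          # unaffected by the previous test
    assert cart["items"] == []

test_add_item_starts_from_empty_cart()
test_new_cart_has_no_items()
```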

Integration with /newapp

When /newapp invokes this skill:

  1. Test framework setup included in project scaffold
  2. Example tests created
  3. Coverage configuration added
  4. CI pipeline includes test stage

Integration

Works with:

  • project-scaffolding - Test structure in new projects
  • systematic-debugging - Tests catch regressions
  • /newapp command - Invokes during implementation phase
  • /fix command - Write failing test before fix
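For the /fix integration, "write failing test before fix" can be sketched like this (total_price and its bug are invented for illustration):

```python
# Regression-test-first: the test below was written while the bug
# was still present, went RED, and drove the fix.
def total_price(prices):
    # Buggy original crashed on an empty cart; fixed version:
    return sum(prices)

def test_total_price_of_empty_cart_is_zero():
    assert total_price([]) == 0

test_total_price_of_empty_cart_is_zero()
```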

Adapted from TDD principles by Kent Beck. Anthropic-recommended workflow for Claude Code.
