tdd
Test-Driven Development
Write tests first, implement minimal code to pass, refactor systematically.
<when_to_use>
- New features with TDD methodology
- Complex business logic requiring coverage
- Critical paths: auth, payments, data integrity
- Bug fixes: reproduce with test, fix, verify
- Refactoring: ensure behavior preservation
- API design: tests define the interface
NOT for: exploratory coding, UI prototypes, static config, trivial glue code
</when_to_use>
Load the maintain-tasks skill for stage tracking. Advance through RED-GREEN-REFACTOR cycle.
| Stage | Trigger | activeForm |
|---|---|---|
| Red | Session start / cycle restart | "Writing failing test" |
| Green | Test written and failing | "Implementing code" |
| Refactor | Tests passing | "Refactoring code" |
| Verify | Refactor complete | "Verifying implementation" |
Task format:
- Write failing test for { feature }
- Implement { feature } to pass tests
- Refactor { aspect }
- Verify { what's being checked }
Workflow:
- Start: Create "Red" stage as in_progress
- Transition: Mark current stage completed, add next stage as in_progress
- After each stage: Run tests before advancing
- Multiple cycles: Return to "Red" for next feature
Edge cases:
- Good existing tests: Start at "Refactor" after confirming pass
- Bug fix: Start at "Red" with failing test reproducing bug
- No regression: Tests must continue passing through all stages
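The workflow above can be sketched as a small state machine. This is only an illustration: `Task`, `start`, and `advance` are hypothetical names, not the maintain-tasks API, and the rule that "Red" must end with a failing run while every other stage must end with a passing run is inferred from the stage table.

```typescript
// Hypothetical stage tracker; names are illustrative, not the maintain-tasks API.
type Stage = 'Red' | 'Green' | 'Refactor' | 'Verify'
type Status = 'in_progress' | 'completed'
interface Task { stage: Stage; status: Status }

const NEXT: Record<Stage, Stage> = {
  Red: 'Green',
  Green: 'Refactor',
  Refactor: 'Verify',
  Verify: 'Red', // multiple cycles: return to "Red" for the next feature
}

// Start: a fresh session opens with "Red" in progress.
function start(): Task[] {
  return [{ stage: 'Red', status: 'in_progress' }]
}

// Transition: mark the current stage completed and add the next as in_progress.
// Refuses to advance unless the latest test run matches what the stage expects.
function advance(tasks: Task[], testsPassed: boolean): Task[] {
  const current = tasks[tasks.length - 1]
  if (!current || current.status !== 'in_progress') {
    throw new Error('no stage in progress')
  }
  // "Red" must end with a failing test; every other stage must end green.
  const expected = current.stage !== 'Red'
  if (testsPassed !== expected) {
    throw new Error(`${current.stage}: expected tests to ${expected ? 'pass' : 'fail'}`)
  }
  return [
    ...tasks.slice(0, -1),
    { ...current, status: 'completed' },
    { stage: NEXT[current.stage], status: 'in_progress' },
  ]
}
```

Calling `advance(start(), false)` yields a completed "Red" task followed by an in-progress "Green" task; calling it with `true` while still in "Red" throws, which mirrors the rule that tests run before every transition.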
RED --> GREEN --> REFACTOR --> RED --> ...
 |        |          |
Test    Impl      Improve
Fails   Passes    Quality
Each cycle: 5-15 min. Longer = step too large, decompose.
Philosophy:
- Red-Green-Refactor as primary workflow
- Test quality over quantity - behavior, not implementation
- Incremental progress - small focused cycles
- Type safety throughout - tests as type-safe as production
<red_phase>
Write tests defining desired behavior before implementation exists.
Guidelines:
- 3-5 related tests fully specifying one feature
- Type system makes invalid states unrepresentable
- Each test = one specific behavior
- Run tests, verify fail for right reason
- Descriptive names forming sentences
TypeScript:
import { describe, test, expect } from 'bun:test'

describe('UserAuthentication', () => {
  test('authenticates with valid credentials', async () => {
    const result = await authenticate({ email: 'user@example.com', password: 'SecurePass123!' })
    expect(result).toMatchObject({ type: 'success', user: expect.objectContaining({ email: 'user@example.com' }) })
  })

  test('rejects invalid credentials', async () => {
    const result = await authenticate({ email: 'wrong@example.com', password: 'wrong' })
    expect(result).toMatchObject({ type: 'error', code: 'INVALID_CREDENTIALS' })
  })

  test.todo('implements rate limiting after failed attempts')
})
Rust:
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn authenticates_with_valid_credentials() {
        let creds = Credentials { email: "user@example.com".into(), password: "SecurePass123!".into() };
        assert!(matches!(authenticate(&creds), Ok(AuthResult::Success { .. })));
    }

    #[test]
    fn rejects_invalid_credentials() {
        let creds = Credentials { email: "wrong@example.com".into(), password: "wrong".into() };
        assert!(matches!(authenticate(&creds), Err(AuthError::InvalidCredentials)));
    }
}
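The guideline about making invalid states unrepresentable can be illustrated with a discriminated union. This is a minimal sketch: the `AuthResult` shape follows the tests above, while the narrowed `code` literals and the `describeResult` helper are assumptions added for illustration.

```typescript
// A loose shape such as { ok: boolean; user?: User; error?: string } would allow
// nonsense like "ok: true" together with an error. The union below rules that out:
// each variant carries exactly the fields that make sense for it.
type AuthResult =
  | { type: 'success'; user: { email: string } }
  | { type: 'error'; code: 'INVALID_CREDENTIALS' | 'MISSING_PASSWORD' }

// Hypothetical consumer: the compiler narrows `result` in each branch, so
// `result.user` is only reachable on success and `result.code` only on error.
function describeResult(result: AuthResult): string {
  switch (result.type) {
    case 'success':
      return `welcome ${result.user.email}`
    case 'error':
      return `denied: ${result.code}`
  }
}
```

Tests written against this type cannot construct a "successful error", so that class of bug needs no test at all.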
Commit: test: add failing tests for [feature]
Transition: Mark "Red" completed, create "Green" in_progress
</red_phase>
<green_phase>
Implement minimum code to make tests pass.
Guidelines:
- Focus on passing tests, not perfect code
- Explicit types where they aid clarity
- Straightforward solutions first
- Hardcode a value if that passes the test; later tests and refactoring generalize it
- Run tests frequently
TypeScript:
type AuthResult = { type: 'success'; user: User } | { type: 'error'; code: string }

async function authenticate(creds: { email: string; password: string }): Promise<AuthResult> {
  if (!creds.password) return { type: 'error', code: 'MISSING_PASSWORD' }
  const user = await findUserByEmail(creds.email)
  if (!user) return { type: 'error', code: 'INVALID_CREDENTIALS' }
  const match = await comparePassword(creds.password, user.passwordHash)
  if (!match) return { type: 'error', code: 'INVALID_CREDENTIALS' }
  return { type: 'success', user }
}
Rust:
pub fn authenticate(creds: &Credentials) -> Result<AuthResult, AuthError> {
    if creds.password.is_empty() { return Err(AuthError::MissingPassword); }
    let user = find_user_by_email(&creds.email).ok_or(AuthError::InvalidCredentials)?;
    if !compare_password(&creds.password, &user.password_hash) {
        return Err(AuthError::InvalidCredentials);
    }
    Ok(AuthResult::Success { user })
}
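The "hardcode first" guideline is easiest to see across two cycles. The sketch below is hypothetical: `isLockedOutV1`, `isLockedOutV2`, and the threshold of 3 are invented for illustration and loosely follow the rate-limiting `test.todo` from the Red phase.

```typescript
// Cycle 1 test: "does not lock an account with no failed attempts".
// A constant is the minimum code that passes it.
function isLockedOutV1(_failedAttempts: number): boolean {
  return false
}

// Cycle 2 test: "locks the account once the attempt limit is reached".
// It fails against V1, which forces the hardcoded value to become real logic.
const MAX_FAILED_ATTEMPTS = 3 // assumed threshold, not from the source

function isLockedOutV2(failedAttempts: number): boolean {
  return failedAttempts >= MAX_FAILED_ATTEMPTS
}
```

Both versions satisfy the first test; only the second satisfies both. The generalization is driven by a new failing test rather than by speculation.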
Verify: bun test / cargo test
Commit: feat: implement [feature] to pass tests
Transition: Mark "Green" completed, create "Refactor" in_progress
</green_phase>
<refactor_phase>
Enhance code quality without changing behavior. Tests must continue passing.
Guidelines:
- Extract common patterns into well-named functions
- Apply SOLID principles where appropriate
- Improve types: discriminated unions, branded types
- No test behavior changes
- Run tests after each step
TypeScript:
// Extract validation
function validateCredentials(creds: { email: string; password: string }): AuthResult | null {
  if (!creds.password) return { type: 'error', code: 'MISSING_PASSWORD' }
  if (!isValidEmail(creds.email)) return { type: 'error', code: 'INVALID_EMAIL' }
  return null
}

// Branded types for safety
type Email = string & { readonly __brand: 'Email' }
Rust:
// Extract validation
fn validate_credentials(creds: &Credentials) -> Result<(), AuthError> {
    if creds.password.is_empty() { return Err(AuthError::MissingPassword); }
    if !is_valid_email(&creds.email) { return Err(AuthError::InvalidEmail); }
    Ok(())
}

// Newtype for safety
pub struct Email(String);
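A branded type only adds safety if values can enter it through a single checked path. The following sketch shows one way to do that in TypeScript; `parseEmail`, `sendWelcome`, and the deliberately simple regular expression are assumptions for illustration, not part of the original example.

```typescript
type Email = string & { readonly __brand: 'Email' }

// Hypothetical smart constructor: the only place a plain string becomes an Email.
// The pattern is an illustrative check, not full email validation.
function parseEmail(input: string): Email | null {
  const trimmed = input.trim().toLowerCase()
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(trimmed) ? (trimmed as Email) : null
}

// Functions that require a validated address ask for Email, not string.
function sendWelcome(to: Email): string {
  return `Welcome mail queued for ${to}`
}

const email = parseEmail(' User@Example.com ')
// sendWelcome('user@example.com') would not compile: a plain string is not an Email.
const message = email ? sendWelcome(email) : 'invalid email'
```

Because the brand exists only at the type level, this refactor changes no runtime behavior for valid input, so the existing tests should keep passing.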
Verify: bun test / cargo test
Commit: refactor: [improvement description]
Transition: Mark "Refactor" completed, create "Verify" in_progress
Final: Run full suite. Mark "Verify" completed when all checks pass.
</refactor_phase>
Follow project conventions, defaulting to:
TypeScript/Bun:
src/{module}/{name}.ts # Implementation
src/{module}/{name}.test.ts # Unit tests colocated
src/{module}/__fixtures__/ # Test data
tests/integration/ # Integration tests
tests/e2e/ # End-to-end tests
Rust:
src/{module}/mod.rs # #[cfg(test)] mod tests { ... }
tests/integration/main.rs # Integration tests (Cargo auto-discovers tests/*.rs and tests/*/main.rs)
tests/fixtures/ # Test data
| Metric | Target |
|---|---|
| Line coverage | >=80% (90% critical paths) |
| Mutation score | >=75% |
| Unit test time | <5s |
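To make the targets above concrete, here is a small arithmetic sketch. It assumes the simplest definition of mutation score (killed mutants divided by killed plus survived); real tools may classify mutants in more ways, such as timeouts or uncovered code. `MutationRun` and `meetsTargets` are hypothetical names.

```typescript
interface MutationRun { killed: number; survived: number }

// Score = killed / (killed + survived), as a percentage (simplified definition).
function mutationScore(run: MutationRun): number {
  const total = run.killed + run.survived
  return total === 0 ? 0 : (run.killed / total) * 100
}

// Checks the table's thresholds: >=80% line coverage (>=90% on critical paths)
// and a mutation score of >=75%.
function meetsTargets(lineCoverage: number, run: MutationRun, critical = false): boolean {
  const coverageTarget = critical ? 90 : 80
  return lineCoverage >= coverageTarget && mutationScore(run) >= 75
}

const run = { killed: 30, survived: 10 } // 30 / 40 = 75%
const regularPath = meetsTargets(85, run)        // true: 85 >= 80 and 75 >= 75
const criticalPath = meetsTargets(85, run, true) // false: 85 < 90
```

The same coverage figure passes for ordinary code and fails for a critical path, which is the point of the two-tier line coverage target.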
Test characteristics:
- Single clear assertion per test
- No execution order dependencies
- Descriptive names forming sentences
- Behavior focus, not implementation
Smells to avoid:
- Setup longer than test
- Multiple unrelated assertions
- Coupling to implementation details
- Flaky tests
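The difference between a behavior-focused test and an implementation-coupled one can be sketched as follows. `Cart` is a hypothetical class used only for contrast, and the check is written as a plain function that throws so the snippet runs without a test runner.

```typescript
// Hypothetical Cart class, used only to contrast the two testing styles.
class Cart {
  private items: { price: number; qty: number }[] = []

  add(price: number, qty = 1): void {
    this.items.push({ price, qty })
  }

  total(): number {
    return this.items.reduce((sum, item) => sum + item.price * item.qty, 0)
  }
}

// Behavior-focused: builds its own cart (no shared state, no order dependency)
// and checks one observable outcome through the public API.
function testTotalSumsPriceTimesQuantity(): void {
  const cart = new Cart()
  cart.add(10, 2)
  cart.add(5)
  if (cart.total() !== 25) throw new Error(`expected 25, got ${cart.total()}`)
}

// Implementation-coupled (avoid): asserting on the private `items` array, e.g.
//   (cart as any).items.length === 2
// fails as soon as storage changes to a Map, even though total() is still correct.

testTotalSumsPriceTimesQuantity()
```

The first style survives a refactor of the internal storage untouched, which is what lets the Refactor stage proceed without changing test behavior.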
See quality-metrics.md for coverage and mutation testing details.
<bug_fixes>
TDD workflow for bugs:
1. Write failing test reproducing bug (Start "Red" in_progress)
2. Verify it fails for the right reason
3. Fix with minimal code (Transition to "Green")
4. Verify it passes and all other tests still pass
5. Refactor if needed (Transition to "Refactor" or skip to "Verify")
6. Commit: fix: [bug description] with test coverage
Example:
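// 1. Failing test
test('handles division by zero gracefully', () => {
  expect(divide(10, 0)).toMatchObject({ type: 'error', code: 'DIVISION_BY_ZERO' })
})

// 3. Fix
// Result shape is assumed here, inferred from the test and the return values below
type Result = { type: 'success'; value: number } | { type: 'error'; code: string }

function divide(a: number, b: number): Result {
  if (b === 0) return { type: 'error', code: 'DIVISION_BY_ZERO' }
  return { type: 'success', value: a / b }
}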
</bug_fixes>
ALWAYS:
- Track progress with Tasks (load maintain-tasks skill)
- Write tests before implementation (RED first)
- Run tests after each stage
- Verify tests fail for right reason in RED
- Keep cycles 5-15 min max
- Descriptive test names forming sentences
- Test behavior, not implementation
- Each test = one reason to fail
NEVER:
- Skip to implementation without tests
- Change test behavior during refactoring
- Test implementation details or private methods
- Allow tests to depend on execution order
- Write flaky tests
- Mark stage complete without running tests
- Multiple unrelated assertions per test
<quick_reference>
# TypeScript/Bun
bun test # Run all tests
bun test --watch # Watch mode
bun test --coverage # Coverage report
bun test --only # Run only .only tests
bun x stryker run # Mutation testing
# Rust
cargo test # Run all tests
cargo test --test NAME # Specific integration test
cargo tarpaulin # Coverage report
cargo mutants # Mutation testing
cargo test -- --nocapture # Show println! output
</quick_reference>
- test-patterns.md - Discriminated unions, builders, mocking, parameterized tests, async patterns for TypeScript and Rust
- quality-metrics.md - Coverage analysis, mutation testing setup, CI integration, thresholds
- feature-implementation.md - Full TDD session walkthrough
- bug-fix.md - TDD workflow for bug fixes