test-driven-development
Test-Driven Development (TDD)
Overview
Write the test first. Watch it fail. Write minimal code to pass.
Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.
Violating the letter of the rules is violating the spirit of the rules.
When to Use
Always: New features, bug fixes, refactoring, behavior changes. Exceptions (ask your human partner): Throwaway prototypes, generated code, configuration files.
Thinking "skip TDD just this once"? Stop. That's rationalization.
The Iron Law
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Write code before the test? Delete it. Start over.
No exceptions: Don't keep it as "reference". Don't "adapt" it while writing tests. Don't look at it. Delete means delete. Implement fresh from tests. Period.
Red-Green-Refactor
digraph tdd_cycle {
rankdir=LR;
red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
verify_red [label="Verify fails\ncorrectly", shape=diamond];
green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
verify_green [label="Verify passes\nAll green", shape=diamond];
refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
next [label="Next", shape=ellipse];
red -> verify_red;
verify_red -> green [label="yes"];
verify_red -> red [label="wrong\nfailure"];
green -> verify_green;
verify_green -> refactor [label="yes"];
verify_green -> green [label="no"];
refactor -> verify_green [label="stay\ngreen"];
verify_green -> next;
next -> red;
}
- RED — Write one minimal failing test. Verify it fails for the right reason.
- GREEN — Write simplest code to pass. No extras.
- REFACTOR — Clean up. Keep tests green. Don't add behavior.
- Repeat — Next failing test for next feature.
For detailed examples (good/bad) and step-by-step instructions, see references/tdd-detailed-process.md.
Red Flags — STOP and Start Over
- Code before test
- Test after implementation
- Test passes immediately
- Can't explain why test failed
- "I already manually tested it"
- "Tests after achieve the same purpose"
- "Keep as reference" or "adapt existing code"
- "Already spent X hours, deleting is wasteful"
- "TDD is dogmatic, I'm being pragmatic"
- "This is different because..."
All of these mean: Delete code. Start over with TDD.
For rationalization counters explaining WHY order matters, see references/why-order-matters.md. For good tests table, verification checklist, and when-stuck guide, see references/tdd-guidelines.md.
Final Rule
Production code -> test exists and failed first
Otherwise -> not TDD
No exceptions without your human partner's permission.
More from hjewkes/agent-skills
self-improve
Use when a session produced reusable insights, when the user says "learn from this", "remember this", or "improve yourself", or after completing a complex task where patterns were discovered
63md-render
Use when asked to render, preview, or view a markdown file in the browser. Triggers on "render markdown", "preview this", "show me this document", "open in browser".
21code-review
Unified code review system — dispatches the right review agents for the situation. Use when reviewing code for quality, bugs, compliance, or before merging.
17skills-management
Use when creating, finding, installing, reviewing, or managing Claude Code skills — covers skill authoring, discovery, conventions, and lifecycle management
14buildkite
Buildkite CI/CD integration. Use when the user needs to check build status, trigger builds, read build logs, debug failures, manage pipelines, or any Buildkite workflow. Triggers include "buildkite", "build", "pipeline", "CI", "deploy", "build log", "build failed".
14agent-browser
Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.
13