tdd-workflow
SKILL.md
Test-Driven Development Workflow
Core Philosophy
TDD is a design discipline, not just a testing technique. Writing tests first forces you to think about the interface before the implementation, producing code that is inherently testable, loosely coupled, and driven by actual requirements.
The Red-Green-Refactor Loop
Step 1: RED — Write a Failing Test
- Identify the smallest piece of behavior to implement next.
- Write a test that describes that behavior from the caller's perspective.
- Run the test suite. The new test MUST fail.
- If the test passes without writing new code, either the behavior already exists or the test is wrong.
Checklist before moving to GREEN:
- The test describes WHAT, not HOW.
- The test name reads like a specification (e.g.,
it should return 404 when resource not found). - The test fails for the right reason (expected assertion failure, not a syntax error or import error).
- The test is isolated — it does not depend on other tests or external state.
Step 2: GREEN — Write the Minimal Implementation
- Write the simplest code that makes the failing test pass.
- It is acceptable — even encouraged — to hardcode values, use naive algorithms, or write "ugly" code at this stage.
- Do NOT add logic that is not required by a failing test.
- Run the full test suite. ALL tests must pass.
The "simplest thing that could work" principle:
- If one test expects
return 42, writereturn 42. Do not write a general formula yet. - If two tests expect different outputs, write an
ifstatement. Do not write a loop yet. - If three tests show a pattern, NOW consider a general algorithm.
Checklist before moving to REFACTOR:
- The new test passes.
- All previously passing tests still pass.
- No code was added beyond what the tests demand.
Step 3: REFACTOR — Improve the Design
- Look for duplication in both production code and test code.
- Extract methods, rename variables, simplify conditionals.
- Apply design patterns only when the code asks for them (three or more instances).
- Run the full test suite after every change. All tests must stay green.
Refactoring targets:
- Duplicate code extracted into shared functions.
- Variable and method names clearly express intent.
- No method exceeds 10-15 lines.
- No function has more than 3 parameters.
- Test code is also clean and readable.
Step 4: REPEAT
Return to Step 1 with the next smallest behavior.
Test Naming Conventions
Use descriptive names that serve as documentation:
// Pattern: [unit]_[scenario]_[expected result]
test_calculateTotal_withEmptyCart_returnsZero
test_calculateTotal_withSingleItem_returnsItemPrice
test_calculateTotal_withDiscount_appliesDiscountToSubtotal
// BDD-style
describe("calculateTotal") {
it("should return zero for an empty cart")
it("should return the item price for a single item")
it("should apply discount to subtotal")
}
// Given-When-Then
given_emptyCart_when_calculateTotal_then_returnsZero
Rules:
- Never use "test1", "test2", or other meaningless names.
- The test name should tell you what broke without reading the test body.
- Group related tests under a
describeor test class.
Choosing What to Test Next
Start with the degenerate cases:
- Null / empty / zero inputs.
- Single element / simplest valid input.
- Typical valid inputs.
- Boundary values and edge cases.
- Error conditions and invalid inputs.
Prioritization:
- Begin with the behavior that drives the most architectural decisions.
- Defer I/O, persistence, and external service tests until core logic is solid.
- Test the happy path first, then edge cases, then error paths.
TDD for Different Test Types
Unit Tests (most common in TDD)
RED: Write a test for a single function or method.
GREEN: Implement just that function.
REFACTOR: Clean up the function and its test.
Cycle time: 1-5 minutes.
Integration Tests
RED: Write a test that exercises two or more components together.
GREEN: Wire the components and implement any missing glue code.
REFACTOR: Extract shared setup, clean interfaces between components.
Cycle time: 5-15 minutes.
API / HTTP Tests
RED: Write a test that sends an HTTP request and asserts on status + body.
GREEN: Implement the route handler with minimal logic.
REFACTOR: Extract validation, business logic, and serialization into separate layers.
Cycle time: 5-20 minutes.
Example: Building a URL Shortener (Unit Level)
# RED: First test — empty slug
def test_shorten_rejects_empty_url():
with pytest.raises(ValueError):
shorten("")
# GREEN: Minimal implementation
def shorten(url):
if not url:
raise ValueError("URL cannot be empty")
pass # nothing else needed yet
# RED: Second test — returns a short code
def test_shorten_returns_six_char_code():
result = shorten("https://example.com")
assert len(result) == 6
# GREEN: Hardcode, then generalize
def shorten(url):
if not url:
raise ValueError("URL cannot be empty")
return url[:6] # naive but passing
# RED: Third test — codes are unique
def test_shorten_returns_unique_codes():
a = shorten("https://example.com/a")
b = shorten("https://example.com/b")
assert a != b
# GREEN: Now we need real logic
import hashlib
def shorten(url):
if not url:
raise ValueError("URL cannot be empty")
return hashlib.md5(url.encode()).hexdigest()[:6]
# REFACTOR: Extract hashing, add type hints, rename for clarity
Handling Untested Legacy Code
When adding features to code without tests:
- Characterization tests first. Write tests that document what the code currently does, not what it should do. Lock down existing behavior.
- Find the seam. Identify a point where you can intercept behavior (dependency injection, method override, function parameter).
- Apply TDD to the new feature. Write failing tests for the new behavior, implement it, then refactor.
- Expand the safety net. Gradually add tests around the touched code.
The Mikado Method for legacy TDD:
- Try a naive change.
- If tests break, note what broke and revert.
- Fix the prerequisites first (add missing tests, extract dependencies).
- Retry the original change.
Common TDD Pitfalls
Writing too large a test
- Symptom: The GREEN step takes more than 15 minutes.
- Fix: Break the test into smaller behavioral increments.
Testing implementation details
- Symptom: Tests break when you refactor, even though behavior is unchanged.
- Fix: Test inputs and outputs, not internal method calls or data structures.
Skipping the RED step
- Symptom: You write code and tests at the same time.
- Fix: Discipline. Always see the test fail first. A test you have never seen fail is a test you cannot trust.
Skipping the REFACTOR step
- Symptom: Code works but is messy, duplicated, or hard to read.
- Fix: Set a timer. After every GREEN, spend at least 2 minutes looking for cleanup opportunities.
Gold-plating during GREEN
- Symptom: You add error handling, logging, or optimizations not required by any test.
- Fix: If no test demands it, delete it. You can add it later when a test asks for it.
Fragile test fixtures
- Symptom: Many tests break when a shared fixture changes.
- Fix: Use factory functions or builders. Each test should set up only what it needs.
Test interdependence
- Symptom: Tests pass in one order but fail in another.
- Fix: Each test must set up and tear down its own state. Run tests in random order to detect this.
TDD Decision Framework
Is the behavior well-understood?
YES -> Classic TDD (test-first)
NO -> Spike first (throwaway prototype), then TDD the real implementation
Is the code interacting with external systems?
YES -> Write a contract/interface test, then use a fake/stub for unit TDD
NO -> Pure function TDD (easiest case)
Is the code algorithmically complex?
YES -> Start with simple examples, build up with property-based tests
NO -> Standard example-based TDD
Are you fixing a bug?
YES -> Write a test that reproduces the bug FIRST, then fix it
NO -> Normal TDD cycle
Key Metrics
- Cycle time: Each red-green-refactor should take 1-15 minutes. Longer cycles mean the step is too big.
- Test count growth: Roughly 1 test per 5-15 lines of production code.
- Refactor frequency: You should refactor at least every 3 cycles.
- All tests passing: At the end of every GREEN and REFACTOR step. Never commit with failing tests.
Summary: The Three Laws of TDD (Robert C. Martin)
- You may not write production code until you have written a failing test.
- You may not write more of a test than is sufficient to fail (and not compiling counts as failing).
- You may not write more production code than is sufficient to pass the currently failing test.
Weekly Installs
2
Repository
claude-code-com…esourcesGitHub Stars
5
First Seen
Feb 26, 2026
Security Audits
Installed on
opencode2
gemini-cli2
claude-code2
github-copilot2
codex2
kimi-cli2