Test-Driven Refactoring

Safely refactor code by establishing test coverage first, making incremental changes, and continuously verifying behavior is preserved.

When to Use This Skill

  • Refactoring legacy code without adequate tests
  • Improving code structure while preserving behavior
  • Adding tests to existing code before modification
  • Migrating or rewriting systems incrementally
  • Reducing risk when changing critical code paths

Core Principles

The Refactoring Safety Net

┌─────────────────────────────────────────────────────────────┐
│                    REFACTORING SAFETY                        │
├─────────────────────────────────────────────────────────────┤
│  1. Tests MUST pass before starting                         │
│  2. Tests MUST pass after every change                      │
│  3. If tests fail → REVERT immediately                      │
│  4. Never change tests and code in the same commit          │
└─────────────────────────────────────────────────────────────┘

Two Hats Rule

"When refactoring, wear only one hat at a time"

Hat                 Activity                              Tests
🎩 Refactoring      Change structure, preserve behavior   Tests unchanged, must pass
🧢 Adding Features  Add new behavior                      Add new tests

Never mix these activities in the same commit.

Test-Driven Refactoring Workflow

Phase 1: Characterization Testing

Before refactoring, capture current behavior with characterization tests.

# Characterization test captures ACTUAL behavior (even if wrong)
def test_calculate_discount_characterization():
    """Characterization test - documents current behavior."""
    # ARRANGE: Set up the scenario
    order = Order(total=150, customer=Customer(is_premium=True))

    # ACT: Call the code under test
    result = calculate_discount(order)

    # ASSERT: Document what it actually does
    # Note: We're not asserting what it SHOULD do,
    # but what it CURRENTLY does
    assert result == 22.5  # 15% discount observed

Writing Characterization Tests

  1. Identify the code to refactor
  2. Write a test that calls it
  3. Use a placeholder assertion (assert result == "PLACEHOLDER")
  4. Run the test and observe the actual output
  5. Update assertion with actual value
  6. Repeat for different inputs and edge cases

def test_discover_actual_behavior():
    """Step-by-step characterization test creation."""
    # Step 1: Call the code
    result = mystery_function(42, "test")

    # Step 2: First run with placeholder to see output
    # assert result == "PLACEHOLDER"  # Will fail, showing actual value

    # Step 3: Update with observed value
    assert result == {"value": 42, "name": "test", "processed": True}

Phase 2: Coverage Analysis

Ensure adequate coverage before refactoring.

# Check current coverage
uv run pytest --cov=src/module_to_refactor --cov-report=term-missing

# Generate HTML report for detailed view
uv run pytest --cov=src/module_to_refactor --cov-report=html

# Target: 80%+ coverage on code being refactored

Coverage Targets

Coverage Level   Risk Assessment
90%+             ✅ Low risk - safe to refactor
80-89%           ⚠️ Moderate risk - add more tests for uncovered paths
70-79%           🟠 Higher risk - significant gaps to fill
<70%             🔴 High risk - write characterization tests first
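The tiers above can be turned into a small helper for a pre-refactoring script. A sketch (the `refactor_risk` name and thresholds simply mirror the table; nothing here is a standard API):

```python
def refactor_risk(coverage_pct: float) -> str:
    """Map a coverage percentage to the risk tiers in the table above."""
    if coverage_pct >= 90:
        return "low"       # safe to refactor
    if coverage_pct >= 80:
        return "moderate"  # add tests for uncovered paths first
    if coverage_pct >= 70:
        return "higher"    # significant gaps to fill
    return "high"          # write characterization tests before touching code
```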

Phase 3: Incremental Refactoring

Make small, safe changes with continuous verification.

┌──────────────────────────────────────────────────────────┐
│                  REFACTORING LOOP                         │
│                                                           │
│    ┌─────────┐     ┌─────────┐     ┌─────────┐          │
│    │  Make   │────▶│  Run    │────▶│ Tests   │          │
│    │  Small  │     │  Tests  │     │ Pass?   │          │
│    │ Change  │     │         │     │         │          │
│    └─────────┘     └─────────┘     └────┬────┘          │
│         ▲                               │                │
│         │                    ┌──────────┴──────────┐     │
│         │                    │                     │     │
│         │                   YES                   NO     │
│         │                    │                     │     │
│         │                    ▼                     ▼     │
│    ┌─────────┐          ┌─────────┐          ┌─────────┐│
│    │  Next   │◀─────────│ Commit  │          │ REVERT  ││
│    │ Change  │          │         │          │         ││
│    └─────────┘          └─────────┘          └─────────┘│
│                                                          │
└──────────────────────────────────────────────────────────┘

Refactoring Commands

# Before each change
uv run pytest -v  # Verify green

# After each change
uv run pytest -v  # Must stay green

# If tests pass
git add -A && git commit -m "refactor: extract validation logic"

# If tests fail
git checkout -- .  # Revert all changes
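The commit-or-revert step can be scripted so it is never skipped. A minimal sketch (the `checkpoint` helper is hypothetical; `runner` is injectable so the loop can be exercised without touching a real repository):

```python
import subprocess

def checkpoint(message: str, runner=subprocess.run) -> bool:
    """Run the suite; commit on green, revert on red. Returns True if committed."""
    tests = runner(["uv", "run", "pytest", "-q"])
    if tests.returncode == 0:
        runner(["git", "add", "-A"])
        runner(["git", "commit", "-m", message])
        return True
    # Red: revert everything immediately, then plan a smaller step
    runner(["git", "checkout", "--", "."])
    return False
```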

Phase 4: Verification

Confirm refactoring success without behavior changes.

# Full test suite
uv run pytest -v

# Type checking
uv run mypy src/

# Linting
uv run ruff check src/

# Coverage comparison (should not decrease)
uv run pytest --cov=src/ --cov-report=term-missing
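To compare coverage numerically rather than by eye, the total can be read from a coverage.py JSON report (produced with `uv run coverage json` after the test run). A sketch:

```python
import json

def total_coverage(path: str = "coverage.json") -> float:
    """Read the total percent covered from a coverage.py JSON report."""
    with open(path) as f:
        report = json.load(f)
    return report["totals"]["percent_covered"]

# Compare against a baseline captured before refactoring, e.g.:
# assert total_coverage() >= baseline, "Coverage decreased during refactoring"
```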

Characterization Test Patterns

Pattern 1: Golden Master Testing

Capture complex output as a "golden master" to compare against.

import json
from pathlib import Path

import pytest

def test_report_generation_golden_master():
    """Compare output against saved golden master."""
    # Generate current output
    report = generate_report(sample_data)
    current_output = json.dumps(report, indent=2, sort_keys=True)

    golden_path = Path("tests/golden/report_output.json")

    if not golden_path.exists():
        # First run: create the golden master
        golden_path.parent.mkdir(parents=True, exist_ok=True)
        golden_path.write_text(current_output)
        pytest.skip("Golden master created - run again to verify")

    # Compare against golden master
    expected = golden_path.read_text()
    assert current_output == expected, "Output differs from golden master"

Pattern 2: Approval Testing

Use approval testing libraries for complex outputs.

from approvaltests import verify_as_json

def test_complex_transformation():
    """Use approval testing for complex outputs."""
    result = complex_transformation(input_data)

    # First run creates .approved file
    # Review and rename to approve
    verify_as_json(result)

Pattern 3: Parameterized Characterization

Capture multiple scenarios efficiently.

import pytest

# Discovered behaviors from production/logs
OBSERVED_BEHAVIORS = [
    # (input, expected_output)
    ({"amount": 100, "type": "A"}, 95.0),
    ({"amount": 100, "type": "B"}, 90.0),
    ({"amount": 0, "type": "A"}, 0.0),
    ({"amount": -50, "type": "A"}, -47.5),  # Note: negative allowed!
]

@pytest.mark.parametrize("input_data,expected", OBSERVED_BEHAVIORS)
def test_calculate_characterization(input_data, expected):
    """Characterization tests from observed behavior."""
    result = calculate(input_data["amount"], input_data["type"])
    assert result == expected

Pattern 4: Seam Testing

Test at "seams" - points where you can alter behavior for testing.

# Original tightly coupled code
class OrderProcessor:
    def process(self, order):
        # Hard to test - creates its own dependencies
        db = Database()
        email = EmailService()
        ...

# Refactored with seams
class OrderProcessor:
    def __init__(self, db=None, email=None):
        # Seam: can inject test doubles
        self.db = db or Database()
        self.email = email or EmailService()

    def process(self, order):
        ...

# Test using the seam
from unittest.mock import Mock

def test_order_processing():
    mock_db = Mock()
    mock_email = Mock()
    processor = OrderProcessor(db=mock_db, email=mock_email)

    processor.process(sample_order)

    mock_db.save.assert_called_once()

Safe Refactoring Techniques

Technique 1: Parallel Change (Expand-Contract)

Add new implementation alongside old, switch over gradually.

# Step 1: Add new implementation alongside old
import logging

logger = logging.getLogger(__name__)

class Calculator:
    def calculate_legacy(self, x, y):
        """Original implementation."""
        return x + y + 1  # Bug: adds 1

    def calculate_new(self, x, y):
        """New correct implementation."""
        return x + y

    def calculate(self, x, y):
        """Router - can switch implementations."""
        # Step 2: Run both, compare results (shadow mode)
        old_result = self.calculate_legacy(x, y)
        new_result = self.calculate_new(x, y)

        if old_result != new_result:
            logger.warning(f"Results differ: {old_result} vs {new_result}")

        # Step 3: Return old result while validating
        return old_result

        # Step 4: After validation, switch to new
        # return new_result

Technique 2: Branch by Abstraction

Introduce abstraction layer to enable safe switching.

# Step 1: Create abstraction
from abc import ABC, abstractmethod

class PaymentProcessor(ABC):
    @abstractmethod
    def process(self, amount: float) -> bool:
        ...

# Step 2: Wrap legacy code
class LegacyPaymentProcessor(PaymentProcessor):
    def __init__(self):
        self._legacy = OldPaymentSystem()

    def process(self, amount: float) -> bool:
        return self._legacy.do_payment(amount)

# Step 3: Create new implementation
class ModernPaymentProcessor(PaymentProcessor):
    def process(self, amount: float) -> bool:
        # New implementation
        ...

# Step 4: Switch via configuration
def get_payment_processor() -> PaymentProcessor:
    if config.use_new_payment:
        return ModernPaymentProcessor()
    return LegacyPaymentProcessor()

Technique 3: Strangler Fig Pattern

Gradually replace legacy system by routing to new code.

class LegacyRouter:
    """Routes requests to legacy or new implementation."""

    def __init__(self):
        self.legacy = LegacySystem()
        self.new = NewSystem()

        # Migration configuration
        self.migrated_routes = {
            "get_user": True,      # Migrated
            "update_user": True,   # Migrated
            "delete_user": False,  # Still legacy
            "create_order": False, # Still legacy
        }

    def handle(self, route: str, **kwargs):
        if self.migrated_routes.get(route, False):
            return self.new.handle(route, **kwargs)
        return self.legacy.handle(route, **kwargs)

Test Fixtures for Refactoring

Fixture: Snapshot Current State

import pytest
from dataclasses import asdict, is_dataclass

@pytest.fixture
def snapshot_state():
    """Capture state before and after operations."""
    snapshots = {"before": None, "after": None}

    def capture(label, obj):
        # Check dataclasses first - they also have __dict__
        if is_dataclass(obj):
            snapshots[label] = asdict(obj)
        elif hasattr(obj, "__dict__"):
            snapshots[label] = obj.__dict__.copy()
        else:
            snapshots[label] = obj

    # Expose snapshots so tests can assert state is unchanged
    capture.snapshots = snapshots
    yield capture

Fixture: Behavior Recorder

@pytest.fixture
def behavior_recorder():
    """Record function calls and returns for characterization."""
    recordings = []

    def record(func):
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            recordings.append({
                "function": func.__name__,
                "args": args,
                "kwargs": kwargs,
                "result": result,
            })
            return result
        return wrapper

    record.get_recordings = lambda: recordings
    return record

Checklist: Before Refactoring

## Pre-Refactoring Checklist

### Test Coverage
- [ ] Coverage report generated for target code
- [ ] Coverage >= 80% on code being refactored
- [ ] All critical paths have tests
- [ ] Edge cases identified and tested

### Characterization Tests
- [ ] Current behavior captured in tests
- [ ] Tests document actual behavior (not intended)
- [ ] Golden masters created for complex outputs
- [ ] Parameterized tests for multiple scenarios

### Environment
- [ ] All tests passing (green baseline)
- [ ] Clean git working directory
- [ ] Feature branch created
- [ ] CI/CD pipeline configured

### Plan
- [ ] Refactoring steps identified
- [ ] Each step is small and reversible
- [ ] Commit points planned
- [ ] Rollback strategy defined

Checklist: During Refactoring

## Refactoring Session Checklist

### Each Change
- [ ] Tests still passing before change
- [ ] Made ONE small change
- [ ] Ran tests immediately after
- [ ] Tests still passing
- [ ] Committed with descriptive message

### If Tests Fail
- [ ] STOPPED immediately
- [ ] Reverted changes (git checkout -- .)
- [ ] Analyzed what went wrong
- [ ] Planned smaller step

### Periodically
- [ ] Type checking passing (mypy)
- [ ] Linting passing (ruff)
- [ ] Coverage not decreased

Resources

Guidelines

  • Always establish test coverage before refactoring
  • Make the smallest possible change that improves the code
  • Run tests after every change, no exceptions
  • If tests fail, revert immediately—don't debug while refactoring
  • Commit after each successful refactoring step
  • Never change behavior and structure in the same commit
  • Use characterization tests to document, not validate, behavior
  • Preserve behavior first, then fix bugs in separate commits