Debugging
Debugging optimizes for certainty, not speed. Speed is a byproduct of certainty. Debugging always starts by ruling out trivial causes. If that is not sufficient, there are two paths to pursue: Fast Path and Normal Path.
Quick Wins (30 seconds max)
First check the no-brainers:
- Typo / wrong variable name
- Missing import / uninstalled dependency
- Stale cache / needs rebuild
- Wrong branch / uncommitted changes
- Environment variable missing / misconfigured
- Runtime / compiler version mismatch
- Unexpected input shape / encoding / locale
If any of these is the cause → fix directly, no protocol needed.
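Most of these checks can be scripted in a few throwaway lines before any deeper work. A minimal sketch in Python; the API_KEY variable and the requests dependency are placeholders for whatever the project actually uses:

```python
import os
import sys

# Runtime version mismatch: confirm which interpreter is actually running.
print("python:", sys.version)

# Missing / misconfigured environment variable (API_KEY is a placeholder name).
print("API_KEY set:", "API_KEY" in os.environ)

# Uninstalled dependency: the import fails fast if the package is absent.
try:
    import requests  # replace with the suspect dependency
except ImportError as exc:
    print("missing dependency:", exc)
```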
Fast Path Debugging
Note: Regular tasks have their own FAST PATH (Rule 4).
Eligible when ALL true:
- Bug conditions are obvious and need no report intake
- Stacktrace/error points directly to the fault location (no inference needed)
- Single hypothesis survives first inspection (no competing explanations)
- Fix is self-evident from the bug (seeing the problem IS seeing the solution)
- Localized (single file, single function)
- Deterministic (not flaky)
Constraint: No refactor, no cleanup, no opportunistic improvements. Fix the bug and stop.
Format:
Mode: Debug (Fast)
Bug: [one-line: location + cause]
Fix: [one-line change]
Verify: [test/command] → [expected outcome]
Proceed (P)?
The verification must pass for the bug to be considered fixed.
Escalation: If fix fails on first attempt → undo the fix, then escalate to normal path. No second tries in fast mode. If new information contradicts the initial hypothesis → abandon Fast Path immediately. Don't bend the hypothesis to fit.
Normal Path Debugging
Bug Report Intake
Before qualification, ground the report:
- Where? (local / CI / staging / prod)
- What branch/version?
- Since when? (always / recent / after X)
- Who's affected? (just me / others / unknown)
- What was already tried?
If critical context is missing, ask before proceeding.
Mode Selection
When debugging is detected, ask: "Debugging in [collaboration mode] because [reason]. Override?"
Defaults: Syntax/type → Autonomous. Logic → User Duck. Architectural → Pairing.
Escalation: 2 failed Autonomous attempts → User Duck. 3 stalled iterations → Stop & Document (see Struggle Report).
Bug Qualification
Convergence Goal: A reproducible trigger + a traced failure path. Iterate until both are stable.
Failure Path: Trace trigger → state → symptom with concrete identifiers (values, line numbers, states).
- Distinguish: Observed (logs/debugger) vs Inferred (code reading)
- Multiple errors: identify primary vs cascade
- Calibration: ❌ "The validation fails" ✓ "score=None at line 47 → ValueError; set by parse_input(line 32) receiving empty string"
Reproducibility: Capture in failing test (ideal — this becomes the regression test) or documented manual steps (acceptable). Reduce to minimal case before analysis — fewer lines reveal more. If sporadic: narrow conditions or instrument. Do not propose fixes for untriggerable bugs.
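One way to capture the repro is a test that asserts the expected behavior, so it fails while the bug exists and passes once it is fixed. A minimal pytest-style sketch; compute_score, its module path, and the expected value 0 are hypothetical:

```python
from myapp.scoring import compute_score  # hypothetical module under test


def test_empty_input_returns_zero_score():
    # Fails today: parse_input("") yields None and scoring raises ValueError.
    # Once fixed, this same test joins the regression suite unchanged.
    assert compute_score("") == 0
```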
Counterfactual Check: What similar inputs/states do NOT trigger the bug?
- Identify the boundary: "fails with empty string but not whitespace" reveals more than "fails with empty string"
- Vary one dimension at a time to isolate the trigger
- Document both positive (triggers) and negative (doesn't trigger) cases
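A parametrized test keeps both sides of the boundary on record. A sketch assuming pytest and the same hypothetical compute_score:

```python
import pytest

from myapp.scoring import compute_score  # hypothetical module under test


# Vary one dimension at a time: empty vs whitespace vs normal input.
@pytest.mark.parametrize(
    ("raw", "should_trigger"),
    [
        ("", True),       # triggers the bug
        ("   ", False),   # whitespace does NOT trigger it: boundary evidence
        ("42", False),    # normal input, known good
    ],
)
def test_empty_input_boundary(raw, should_trigger):
    if should_trigger:
        with pytest.raises(ValueError):
            compute_score(raw)
    else:
        compute_score(raw)  # must not raise
```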
Temporal Bug Patterns: If timing-related, identify the category:
- Race condition: multiple actors, shared state, non-deterministic outcome
- Async ordering: operations complete in unexpected sequence
- Cache staleness: read returns outdated value
- Eventual consistency: state not yet propagated
Instrumentation for temporal bugs:
- Add timestamps/sequence IDs to logs
- Inject artificial delays to widen race windows
- Force cache misses to isolate caching effects
- Check replica lag / propagation status
These bugs may resist minimal reproduction — document the timing conditions even if not fully deterministic.
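Two of these instruments sketched in Python: sequence-numbered log lines for ordering, and an injected delay that widens a suspected race window until the lost update reproduces reliably. The shared counter and workers are hypothetical:

```python
import itertools
import logging
import threading
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
seq = itertools.count()  # monotonic sequence IDs to order interleaved log lines
counter = 0              # hypothetical shared state under suspicion


def worker(name: str) -> None:
    global counter
    local = counter
    logging.info("seq=%d %s read=%d t=%.6f", next(seq), name, local, time.monotonic())
    time.sleep(0.01)     # artificial delay: deliberately widens the race window
    counter = local + 1
    logging.info("seq=%d %s wrote=%d t=%.6f", next(seq), name, counter, time.monotonic())


threads = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("expected 2, got", counter)  # the lost update is now reproducible
```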
Qualification Summary:
Bug Profile: [regression|novel] [deterministic|flaky] [reproducible|sporadic] [tested|untested]
Optional: [technical|logical] [observable|opaque] [localized|distributed] [stateless|stateful]
If regression: consider git bisect (known-good SHA, fast test command; moves HEAD — approval required).
Signal: [symptom] vs [expected] — Observability: [stacktrace|logs|metrics|user report|silent]
Impact: [data loss | security exposure | user-visible | silent corruption | degraded performance | cosmetic]
Reproduction: [steps] — Feedback loop: [fast <10s | slow | manual]
Failure Path: [traced path, observed vs inferred]
Missing Evidence: [none | logs: ... | access: ... | repro: ... | spec: ... | metrics: ... | permissions: ...]
User context needed? [N / Y: specific question]
Proceed to analysis? (Y/N)
Pattern Analysis
Before narrowing, establish a reference point:
- Find working analogues in the codebase (similar functionality that works)
- Study relevant portions — focus on the code path that differs from your bug
- Catalog distinctions between working and broken versions
- Note dependencies and environmental differences
If no analogue exists, note explicitly — you're debugging novel code without a known-good reference.
Skip this phase when: the bug is in isolated logic with no comparable pattern, or the failure path already points to a clear contract violation.
Analysis (Divide and Conquer)
Debugging is a narrowing search, not guess-and-check.
Observability check: Before narrowing, ensure you can observe the relevant state transitions. If not, add instrumentation (logging, tracing, metrics) before forming hypotheses. Debugging blind wastes cycles.
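When the relevant transition is invisible, a throwaway tracing decorator is often the cheapest instrument. A sketch in Python; the decorator and the commented application to the hypothetical parse_input are illustrative only:

```python
import functools
import logging

logging.basicConfig(level=logging.DEBUG, format="%(name)s %(message)s")
log = logging.getLogger("debug.trace")


def trace(fn):
    """Temporary instrumentation: log the inputs and outputs of a suspect function."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        log.debug("call %s args=%r kwargs=%r", fn.__name__, args, kwargs)
        result = fn(*args, **kwargs)
        log.debug("return %s -> %r", fn.__name__, result)
        return result
    return wrapper


# Applied to the suspect identified in the failure path, then removed after the fix:
# parse_input = trace(parse_input)
```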
Loop until atomic:
- State the problem space — check for biases:
  - Anchoring: Am I overweighting my first hypothesis?
  - Stacktrace trust: This shows where it failed, not necessarily where the bug is
  - "It worked before": Maybe the bug was always there, just not triggered
  - Correlation ≠ cause: Are these events actually related, or just coincident?
- List 2-3 hypotheses if possible. Pick one with rationale, favoring bisection over single-point tests (see the sketch after this list)
- Implement verification (typically a test) — ask: "What evidence would falsify this hypothesis fastest?" Add instrumentation if needed to observe the outcome.
- Narrow to surviving subspace. If ambiguous, try different bisection point.
- Exit condition: If narrowing reveals expected behavior → False Positive path: document why, assess doc/UX need, close as "Not a Bug"
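Favoring bisection means probing the midpoint of the failure path so one observation eliminates half the search space. A sketch against the hypothetical parse_input → scoring pipeline from the Calibration example:

```python
from myapp.parser import parse_input  # hypothetical first half of the failing pipeline


def test_parse_stage_midpoint_probe():
    # Bisection probe at the pipeline midpoint: if this fails, the defect is in
    # parsing and the entire scoring half drops out of the search space; if it
    # passes, parsing is exonerated and the search narrows to scoring.
    assert parse_input("") is not None
```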
Analysis Summary: Before Root Cause Identification, summarize the narrowing chain.
Root Cause Gate
Where → Why: Pull the string until you reach an actionable flaw — something that should have been different and can be corrected.
Before proposing fix: what evidence would prove this root cause wrong?
Root Cause Gate:
Category: [wrong assumption | missing guard | violated contract | design gap | external dependency | timing/race | data corruption | observability gap | other: ...]
Root Cause: [one-line actionable flaw]
Chain: [root cause] → [symptoms] → [impacts]
Falsifiable by: [evidence that would prove this wrong]
Confidence: [high: would bet on it | medium: best hypothesis | low: educated guess]
Proceed to Resolution (P)?
Resolution
- Verify: Re-run repro to confirm bug still exists
- Fix: Minimum scope, root cause not symptom. Compact Approval request.
- If bug traced to external dependency: present options (workaround / upstream report / version change) before proceeding.
- Protect: Repro test → regression suite (TDD workflow; see the sketch after this list):
  - If manual repro only: write failing test now
  - Verify test fails for the right reason
  - Implement fix → verify test passes
  - Run related tests for regressions
- Search codebase for similar patterns
- Post-fix validation: check adjacent code paths, inverse conditions, and previously working edge cases
- Confidence Gate: Explain why this fix (not something else) addresses the root cause stated in the Gate.
  - If explanation doesn't hold → Low confidence regardless of test results.
  - High → proceed to close
  - Medium → require additional validation (extended soak, peer review, or monitoring period)
  - Low → do not close; document as "mitigated, not resolved", record in TECH_DEBT.md, schedule follow-up
- Close: Summary with lessons learned
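The "fails for the right reason" step above can be made mechanical before the fix lands: a strict xfail pinned to the traced exception exposes any incidental failure (bad import, broken fixture) as a real red. A pytest sketch reusing the hypothetical compute_score:

```python
import pytest

from myapp.scoring import compute_score  # hypothetical module under test


@pytest.mark.xfail(raises=ValueError, strict=True)
def test_empty_input_returns_zero_score_regression():
    # Before the fix: the strict xfail only reports "expected failure" when the
    # test fails with the traced ValueError; any other error shows up as a real
    # failure. After the fix lands, remove the marker so this runs as a normal
    # member of the regression suite.
    assert compute_score("") == 0
```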
Post-Resolution Reflection:
- Why did this bug escape earlier detection? (missing test, untested edge case, insufficient logging, unclear contract)
- What signal could have caught it sooner? (test, assertion, metric, log pattern)
- Is this a one-off or a pattern worth addressing systemically?
Stop-on-Repeat: Same fix twice → explain why it will work this time.
Struggle Report (3 stalled attempts):
Attempted: [approach] — failed because [reason]
Dead Ends: [do not revisit]
Evidence Gathered: [what we now know that we didn't before]
Remaining Hypotheses: [untested]
Blockers: [what prevents progress]
Next Steps: [escalation or alternative]