Troubleshooting Regression

Use this skill when the framework appears stale, stuck, or regressed and you need deterministic diagnosis plus fix verification.

When to Use

Claude debug sessions stall after spawning agents.
Hooks block expected actions unexpectedly.
Memory/search/token-saver enforcement appears inconsistent.
A regression needs a repeatable reproduction and validation run.

Iron Law

Do not declare a regression fixed without:

reproducible trigger prompt,
trace evidence from pnpm trace:query,
hook/tool evidence from debug logs,
targeted test pass for touched scope.

Workflow

Identify session and log source.
Run trace query first (pnpm trace:query --trace-id <traceId> --compact --since <ISO-8601> --limit 200).
Extract high-signal errors (excluding known MCP auth/startup noise).
Map each error to owning hook/module.
Patch minimal code path and add/update regression test.
Run targeted checks (tests + lint/format on changed files).
Re-run debug prompt and verify error class no longer reproduces.
Record learnings/issues in memory.

Evidence Model

Source of truth: C:\\Users\\<user>\\.claude\\debug\\*.txt
Trace source of truth: pnpm trace:query output for the same incident window
Filter: ignore external MCP transport/auth noise; keep framework/runtime errors
Error classes:
- routing/task lifecycle
- memory/search/token-saver guardrails
- hook contract/schema violations
- workflow phase/idempotency failures

Command Surface

Primary wrapper:

node .claude/skills/troubleshooting-regression/scripts/main.cjs --prompt "search the codebase for any issues or bugs"
pnpm trace:query --trace-id <traceId> --compact --since <ISO-8601> --limit 200

Optional direct log analysis:

node .claude/skills/troubleshooting-regression/scripts/main.cjs --log-path "C:\\Users\\<user>\\.claude\\debug\\<session>.txt"

Output Contract

ok: boolean
logPath: analyzed log path
findings[]: normalized findings with severity and owner file hints
nextActions[]: concrete fix/validation actions

Related Artifacts

Workflow: .claude/workflows/troubleshooting-regression-skill-workflow.md
Tool: .claude/tools/troubleshooting-regression/troubleshooting-regression.cjs
Command: .claude/commands/troubleshooting-regression.md

Examples

# Analyze latest log
node .claude/skills/troubleshooting-regression/scripts/main.cjs --mode quick

# Analyze specific log and fail when critical findings exist
node .claude/skills/troubleshooting-regression/scripts/main.cjs --log-path "<path>" --strict

Iron Laws

ALWAYS collect evidence (logs, trace IDs, error messages) before making any configuration changes
NEVER modify hook or routing configuration without first reproducing the failure deterministically
ALWAYS validate the fix by running the full regression test suite after each change
NEVER mark a regression resolved until the exact failure scenario passes end-to-end
ALWAYS document the root cause, fix, and validation evidence in issues.md for future reference

Anti-Patterns

Anti-Pattern	Why It Fails	Correct Approach
Making changes without reproducing failure	Can't verify the change fixed the right thing	Reproduce deterministically first, then fix
Fixing symptoms without root cause analysis	Problem recurs under different conditions	Use 5 Whys to trace to the root cause
Skipping regression suite after fix	Fix may break other functionality	Run full test suite after every change
Undocumented fixes	Same regression returns in future sessions	Record root cause, fix, and validation in issues.md
Resolving without end-to-end validation	Integration issues are missed	Validate the exact failure path end-to-end

Memory Protocol

Before starting:

cat .claude/context/memory/learnings.md

After completing:

Regression pattern -> .claude/context/memory/learnings.md
Open defect or risk -> .claude/context/memory/issues.md
New enforcement decision -> .claude/context/memory/decisions.md

ASSUME INTERRUPTION: Your context may reset. If it's not in memory, it didn't happen.