Investigate — Root Cause Analysis Engine
Systematic deep investigation protocol. Finds the REAL cause, not the surface symptom.
Core principle: Never fix what you don't understand. Every fix must trace to a proven root cause with evidence.
Protocol
Process every /investigate invocation through these 8 phases in strict order. Never skip a phase. Never jump to Phase 7 (FIX) without completing Phases 1-6.
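The strict ordering above can be pictured as a small gate check. This is only an illustrative sketch in Python: the `Phase` enum mirrors the eight phase names, while the `Investigation` class and its `complete` method are hypothetical names, not part of the skill itself.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The eight protocol phases, numbered as in the text."""
    OBSERVE = 1
    REPRODUCE = 2
    TRACE = 3
    HYPOTHESIZE = 4
    PROVE = 5
    ROOT_CAUSE = 6
    FIX = 7
    PREVENT = 8


class Investigation:
    """Hypothetical tracker that refuses to pass gates out of order."""

    def __init__(self) -> None:
        self.completed: list[Phase] = []

    def complete(self, phase: Phase) -> None:
        if len(self.completed) == len(Phase):
            raise RuntimeError("all phases are already complete")
        # The next allowed phase is always the one after the last completed.
        expected = Phase(len(self.completed) + 1)
        if phase != expected:
            raise RuntimeError(
                f"cannot complete {phase.name}: {expected.name} gate not passed yet"
            )
        self.completed.append(phase)
```

With this shape, calling `complete(Phase.FIX)` right after `complete(Phase.OBSERVE)` raises instead of silently skipping Phases 2-6.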
Phase 1: OBSERVE — Gather All Symptoms
Collect every observable fact before forming any theory.
- Parse `$ARGUMENTS` as the symptom description
- Ask the user for additional context if the description is vague, using AskUserQuestion:
- What's the expected behavior vs actual behavior?
- When did it start? What changed recently?
- Is it consistent or intermittent?
- Any error messages, logs, or stack traces?
- Check memory files for known pitfalls related to this area:
- Read MEMORY.md and any topic-specific memory files
- Check CLAUDE.md for documented patterns
- Gather environmental context:
- Run `git log --oneline -20` to see recent changes
- Run `git diff --stat HEAD~5` to see what files changed recently
- Check for any failing tests with the project's test runner
Output: A symptom report listing every observable fact, recent changes, and any relevant memory entries.
Gate: Do NOT theorize yet. Only facts.
Phase 2: REPRODUCE — Confirm the Issue
An issue you cannot reproduce is an issue you cannot prove you fixed.
- Identify the shortest path to trigger the symptom:
- Run existing tests that cover the affected area
- If no test exists, attempt manual reproduction via Bash
- For UI issues, check if a Playwright MCP sequence can reproduce it
- Document the reproduction steps precisely
- If the issue is intermittent:
- Flag it as potentially timing-dependent (race condition, async, state)
- Look for concurrent access, shared mutable state, missing locks/guards
- Check for dependency on external state (network, filesystem, database)
- If the issue cannot be reproduced:
- Shift to forensic investigation (logs, git history, code review)
- Do NOT skip remaining phases — proceed with available evidence
Output: Reproduction steps, or explicit documentation of why reproduction failed.
Gate: Issue confirmed (or forensic mode declared). Proceed.
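For the intermittent case, one way to turn "sometimes fails" into a number is to run the trigger repeatedly and count failures. The sketch below assumes the reproduction can be wrapped in a Python callable that raises on failure; `failure_rate` and `classify` are hypothetical helper names introduced here for illustration.

```python
from typing import Callable


def failure_rate(trigger: Callable[[], None], attempts: int = 50) -> float:
    """Run `trigger` repeatedly and return the fraction of runs that raised."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    failures = 0
    for _ in range(attempts):
        try:
            trigger()
        except Exception:
            failures += 1
    return failures / attempts


def classify(rate: float) -> str:
    """Map a failure rate onto the Phase 2 outcomes."""
    if rate == 0.0:
        return "not reproduced"
    if rate == 1.0:
        return "consistent"
    return "intermittent"
```

A rate of zero only means the issue did not appear in these attempts. In that case the protocol's forensic mode still applies.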
Phase 3: TRACE — Follow the Execution Path
Start from the symptom and trace backward to the origin.
- Locate the symptom — find the exact file and line where the error occurs:
- Use Grep for error messages, exception types, log strings
- Use Explore agent for broad searches if the location is unclear
- Trace the call chain — read every file in the execution path:
- From error site → caller → caller's caller → entry point
- Read each file fully with Read tool — do NOT skim
- Document the complete flow: input → transform → output
- Trace the data flow — follow the data that caused the error:
- What value caused the crash? Where did it come from?
- Trace the value backward: variable → assignment → source → input
- Map dependencies — what else touches this code path:
- Use Grep to find all callers of the failing function
- Check for shared state, singletons, global variables
- Look for recent changes in dependencies with `git log --oneline -- <file>`
- Check git forensics — when was the problem introduced:
- `git log --oneline -- <affected-files>`: who changed these files and when?
- `git blame <file>` on the suspicious lines: what commit introduced them?
- If a clear suspect commit is found, read its full diff
Output: Complete execution trace with file paths and line numbers. Data flow map. Git timeline.
Gate: The full code path from entry point to symptom is mapped and understood.
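When the failing code happens to be Python, the "error site → caller → entry point" ordering can be read straight off the exception. This is a minimal sketch using the standard `traceback` module. The three example functions (`parse_amount`, `load_row`, `handle_request`) are invented purely to give the trace something to walk.

```python
import traceback


def call_chain(exc: BaseException) -> list[tuple[str, str, int]]:
    """Return (function, file, line) from the error site back toward the entry point."""
    frames = traceback.extract_tb(exc.__traceback__)
    # extract_tb lists frames outermost-first; reverse to start at the error site.
    return [(f.name, f.filename, f.lineno) for f in reversed(frames)]


# Hypothetical three-level call path used only for demonstration.
def parse_amount(raw: str) -> float:
    return float(raw)  # error site: raises ValueError on malformed input


def load_row(row: dict) -> float:
    return parse_amount(row["amount"])


def handle_request(row: dict) -> float:
    return load_row(row)
```

Catching the `ValueError` from `handle_request({"amount": "12,50"})` and passing it to `call_chain` yields `parse_amount` first, then `load_row`, then `handle_request`, which is the backward order Phase 3 asks for.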
Phase 4: HYPOTHESIZE — Deep Reasoning with 5 Whys
This phase MUST use the sequential-thinking MCP server for structured multi-step reasoning.
- Start the sequential-thinking chain with the symptom and all evidence from Phases 1-3
- Apply the 5 Whys method — for each answer, ask "but why does THAT happen?":
    Symptom: App crashes when tapping a document
    Why 1: DocumentDetailView accesses a deleted NSManagedObject
    Why 2: The object was deleted from Core Data while the view held a reference
    Why 3: context.delete() was called from a background operation
    Why 4: The background sync didn't check if the view was still displaying the object
    Why 5: There's no soft-delete pattern; objects are hard-deleted immediately
    ROOT CAUSE: Missing soft-delete guard in the sync pipeline
- Generate at least 2 competing hypotheses; don't lock onto the first theory:
- Categorize each by type: Code Logic | Data State | Timing/Race | Environment | Dependency | Configuration
- For each hypothesis, define what evidence would prove or disprove it
- Use branching in sequential-thinking to explore alternative explanations:
    branchFromThought: 3, branchId: "alternative-cause"
- Rank hypotheses by likelihood based on available evidence
Output: Ranked list of hypotheses with evidence requirements for each.
Gate: At least 2 hypotheses generated. Each has defined proof criteria.
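The Phase 4 output and gate can be sketched as a small data structure. The categories below come from the list above. The `Hypothesis` fields, the numeric `likelihood` score, and the `rank` function are assumptions made for this illustration, and the second example hypothesis is invented to show a competing theory.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    CODE_LOGIC = "Code Logic"
    DATA_STATE = "Data State"
    TIMING_RACE = "Timing/Race"
    ENVIRONMENT = "Environment"
    DEPENDENCY = "Dependency"
    CONFIGURATION = "Configuration"


@dataclass(frozen=True)
class Hypothesis:
    statement: str
    category: Category
    likelihood: float  # subjective estimate in [0, 1]; an assumption of this sketch
    proof: str         # evidence that would confirm the hypothesis
    disproof: str      # evidence that would eliminate it


def rank(hypotheses: list[Hypothesis]) -> list[Hypothesis]:
    """Enforce the Phase 4 gate, then return hypotheses most likely first."""
    if len(hypotheses) < 2:
        raise ValueError("need at least 2 competing hypotheses")
    for h in hypotheses:
        if not h.proof.strip() or not h.disproof.strip():
            raise ValueError(f"missing proof criteria: {h.statement!r}")
    return sorted(hypotheses, key=lambda h: h.likelihood, reverse=True)


# Illustrative candidates based on the crash example above.
candidates = [
    Hypothesis(
        "View keeps a stale reference after navigation",
        Category.DATA_STATE, 0.3,
        proof="crash reproduces with sync disabled",
        disproof="crash never occurs with sync disabled",
    ),
    Hypothesis(
        "Background sync hard-deletes an object the view still displays",
        Category.TIMING_RACE, 0.7,
        proof="delete call logged just before the crash",
        disproof="no delete call precedes the crash",
    ),
]
ranked = rank(candidates)
```

Here `ranked[0]` is the timing hypothesis, so Phase 5 would test it first.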
Phase 5: PROVE — Test Each Hypothesis with Evidence
Systematically confirm or eliminate each hypothesis. No guessing.
For each hypothesis (highest-ranked first):
- Gather confirming evidence:
- Read the specific code paths predicted by the hypothesis
- Check logs/output for patterns the hypothesis predicts
- Run targeted tests that would pass if the hypothesis is correct
- Use `git blame` / `git log` to check if timing matches
- Gather disconfirming evidence:
- Look for code paths that should also fail if the hypothesis is correct but don't
- Check edge cases that contradict the hypothesis
- Check external sources:
- Use WebSearch for known issues in the library/framework version
- Use library-docs skill (context7 MCP) to verify correct API usage
- Search GitHub issues for the library: `mcp__github__search_issues`
- Verdict per hypothesis:
- CONFIRMED — evidence supports it, no contradictions
- ELIMINATED — evidence contradicts it
- INCONCLUSIVE — need more evidence (define what)
If all hypotheses are eliminated: Return to Phase 4 with new evidence. Generate new hypotheses.
Output: Evidence log per hypothesis. One confirmed root cause (or request for more data).
Gate: Exactly one root cause confirmed with evidence. Or an explicit statement that the cause requires additional data from the user (with specific questions).
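The verdict rules and the Phase 5 gate reduce to a few branches. The sketch below is one possible reading, not the skill's own implementation. Counting evidence as integers and the function names `judge` and `next_step` are assumptions. Sending the "more than one confirmed" case back to evidence gathering is also an assumption, since the gate requires exactly one root cause.

```python
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "confirmed"
    ELIMINATED = "eliminated"
    INCONCLUSIVE = "inconclusive"


def judge(supporting: int, contradicting: int) -> Verdict:
    """Apply the per-hypothesis verdict rules to evidence counts."""
    if contradicting > 0:
        return Verdict.ELIMINATED      # any contradiction eliminates
    if supporting > 0:
        return Verdict.CONFIRMED       # support with no contradictions
    return Verdict.INCONCLUSIVE        # nothing either way yet


def next_step(verdicts: dict[str, Verdict]) -> str:
    """Decide what the Phase 5 gate allows, given one verdict per hypothesis."""
    confirmed = [name for name, v in verdicts.items() if v is Verdict.CONFIRMED]
    if len(confirmed) == 1:
        return f"proceed to Phase 6 with: {confirmed[0]}"
    if verdicts and all(v is Verdict.ELIMINATED for v in verdicts.values()):
        return "return to Phase 4 and generate new hypotheses"
    return "gather more evidence or ask the user specific questions"
```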
Phase 6: ROOT CAUSE — Document the Causal Chain
Write the definitive explanation before touching any code.
- Document the complete causal chain:
    ROOT CAUSE: <the deepest systemic issue>
      → causes: <intermediate effect>
      → causes: <intermediate effect>
      → manifests as: <the symptom the user reported>
- Explain why this is the root cause (not just a proximate cause):
- If fixed, would it prevent recurrence? (yes = root cause)
- Is there a deeper cause? (if yes, keep digging)
- Identify the blast radius — what else is affected:
- Are there similar patterns elsewhere in the codebase?
- Use Grep to find analogous code that may have the same bug
- Present the root cause analysis to the user before proceeding to fix
Output: Root cause statement, causal chain, blast radius assessment.
Gate: User understands and agrees with the diagnosis before any fix is attempted.
Phase 7: FIX — Address the Root Cause
Fix the root cause, not the symptom. Minimal, targeted change.
- Design the fix:
- What is the minimum change that eliminates the root cause?
- Does the fix handle all cases in the blast radius (Phase 6)?
- Does the fix introduce any new risks?
- Implement the fix:
- Read every file before modifying it
- Make the smallest change possible
- Add inline comments only where the fix is non-obvious
- Verify the fix:
- Run the reproduction steps from Phase 2 — symptom should be gone
- Run existing tests — no regressions
- Run code-quality agent on modified files if the change is substantial
- Check for similar patterns:
- If the bug was a pattern (e.g., missing null check), search for the same pattern elsewhere
- Fix all instances, not just the reported one
Output: Code changes with explanation of what was changed and why.
Phase 8: PREVENT — Ensure It Never Recurs
The investigation isn't complete until recurrence is prevented.
- Add a regression test that would have caught this bug:
- The test must fail without the fix and pass with it
- Use test-automation agent for comprehensive test generation
- Update project memory if a new pitfall was discovered:
- Add to MEMORY.md under Common Pitfalls
- Include the pattern, why it's dangerous, and the safe alternative
- Suggest structural improvements (optional, only if the bug reveals a design flaw):
- Propose architectural changes that make this class of bug impossible
- Present as a suggestion, not an immediate action
- Write the investigation summary:
## Investigation Report
**Symptom:** <what was reported>
**Root Cause:** <the deepest systemic issue>
**Causal Chain:** root cause → ... → symptom
**Fix:** <what was changed, which files>
**Blast Radius:** <other areas checked/fixed>
**Regression Test:** <test added>
**Prevention:** <memory updated, guard added, pattern documented>
**Time:** <phases completed, hypotheses tested>
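The Phase 8 requirement that a regression test "must fail without the fix and pass with it" can be checked by running the same test against both versions of the code. The sketch below uses the missing-null-check pattern mentioned in Phase 7. The `display_name_*` functions and the `"anonymous"` fallback are hypothetical stand-ins for real project code.

```python
from typing import Callable, Optional


def display_name_before_fix(nickname: Optional[str]) -> str:
    return nickname.strip()  # bug: AttributeError when nickname is None


def display_name_after_fix(nickname: Optional[str]) -> str:
    if nickname is None:  # the guard the fix adds
        return "anonymous"
    return nickname.strip()


def regression_check(display_name: Callable[[Optional[str]], str]) -> bool:
    """Return True when the originally reported crash input is handled."""
    try:
        return display_name(None) == "anonymous"
    except AttributeError:
        return False
```

A test is only a valid regression test if `regression_check(display_name_before_fix)` is false and `regression_check(display_name_after_fix)` is true. A test that passes against both versions would not have caught the bug.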
Tool Usage by Phase
| Phase | Primary Tools | When to Use Agents |
|---|---|---|
| 1. OBSERVE | Read, Grep, Bash (git log) | — |
| 2. REPRODUCE | Bash (test runner), Playwright MCP | — |
| 3. TRACE | Read, Grep, Glob, Bash (git blame) | Explore agent for broad searches |
| 4. HYPOTHESIZE | sequential-thinking MCP | deep-analysis skill |
| 5. PROVE | Read, Grep, Bash, WebSearch, context7 MCP | library-docs skill, GitHub MCP |
| 6. ROOT CAUSE | Read, Grep | Explore agent for blast radius |
| 7. FIX | Read, Edit, Write, Bash | code-quality agent for review |
| 8. PREVENT | Write, Edit, Bash | test-automation agent for tests |
Anti-Patterns — What This Skill Prevents
| Bad Habit | What /investigate Does Instead |
|---|---|
| Jump straight to fixing | Forces Phases 1-6 before any code change |
| Fix the symptom | 5 Whys drills to root cause |
| Single theory tunnel vision | Requires 2+ competing hypotheses |
| "It works now" without understanding | Demands evidence-based proof |
| Fix one instance, miss others | Blast radius analysis in Phase 6 |
| No regression test | Phase 8 mandates a test |
| Knowledge lost | Memory update in Phase 8 |
When to Use /investigate vs Other Tools
| Situation | Use |
|---|---|
| Bug, crash, error, unexpected behavior | /investigate |
| Build a new feature | /execute |
| Quick "what does this code do?" | Explore agent directly |
| Performance slow but unclear why | /investigate (treat slowness as symptom) |
| Known fix, just need to apply it | Direct Edit — no investigation needed |
| Security vulnerability found | /investigate + security-scan |
References
See references/investigation-frameworks.md for detailed methodology guides.