Investigate
Systematic methodology for finding the root cause of bugs, failures, and unexpected behavior. Cycle through characterize-isolate-hypothesize-test steps, with oracle escalation for hard problems. Diagnose the root cause — do not apply fixes.
Optional: $ARGUMENTS contains the problem description or error message.
Step 1: Characterize
Gather the symptom and establish what is actually happening:
- Collect evidence — error message, stack trace, test output, log entries, or user description of unexpected behavior
- Classify the problem type:
| Signal | Type |
|---|---|
| Stack trace / exception | Runtime error |
| Test assertion failure | Test failure |
| Compilation / bundler / build error | Build failure |
| Type checker error (tsc, mypy, pyright) | Type error |
| Slow response / high CPU / memory growth | Performance |
| "It does X instead of Y" / no error | Unexpected behavior |
- Establish reproduction — run the failing command, test, or operation. If the problem cannot be reproduced (intermittent, environment-specific), document the constraints and proceed with historical evidence.
Record the exact reproduction command and its output for verification. For intermittent or long-running reproductions, use the Monitor tool to tail logs filtered for relevant signals (errors, stack traces, specific identifiers) so failures surface live while you work.
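A plain-shell equivalent of this capture-and-tail pattern, as a minimal sketch (the test path, log path, and filter terms are hypothetical):

```bash
# Capture the exact reproduction command and its full output for later verification.
npx jest src/checkout.test.ts 2>&1 | tee /tmp/repro-output.txt

# For intermittent failures, tail the log filtered for relevant signals while working.
tail -f logs/app.log | grep --line-buffered -E "ERROR|Traceback|checkout"
```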
Step 2: Isolate
Narrow from "something is wrong" to "the problem is in this area." Read references/problem-type-playbooks.md for type-specific first moves and tool sequences.
Git Archeology
For all problem types, check what changed recently near the failure point:
git log --oneline -20 -- <file>
git blame -L <start>,<end> <file>
If a known-good state exists (e.g., "this worked yesterday"), consider git bisect to pinpoint the breaking commit.
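A minimal git bisect sketch, assuming the failure is reproducible by a single command and a known-good commit exists (the commit hash and test command are hypothetical):

```bash
# Binary-search the history between a known-good commit and the current broken one.
git bisect start
git bisect bad HEAD
git bisect good abc1234                 # hypothetical last-known-good commit

# Let git run the reproduction at each step; a non-zero exit code marks the commit bad.
git bisect run npx jest src/checkout.test.ts

# Review the bisect result, then restore the original checkout.
git bisect log
git bisect reset
```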
Scope Narrowing
- Stack traces: Read the throwing function and its callers — full functions, not just the flagged line
- Test failures: Read both the test and the system under test
- Build errors: Read the config file and the referenced source
- Unexpected behavior: Trace the data flow from input to the unexpected output
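For example, a plain-shell version of the stack-trace first move — locating the throwing function's callers — might look like this (the function name and directory are hypothetical):

```bash
# Find every call site of the throwing function, excluding its own definition.
grep -rn "applyDiscount(" src/ --include="*.ts" | grep -v "function applyDiscount"
```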
Step 3: Hypothesize
Generate 2-4 hypotheses ranked by likelihood. Each hypothesis must be falsifiable — specify what evidence would confirm or refute it.
Format:
H1 (most likely): [description] — confirmed if [X], refuted if [Y]
H2: [description] — confirmed if [X], refuted if [Y]
H3: [description] — confirmed if [X], refuted if [Y]
Parallel Investigation
For complex problems with 3+ hypotheses and a non-obvious root cause, spawn parallel investigators.
Spawn condition: 3+ hypotheses AND the problem is not a simple typo, missing import, or syntax error.
Skip when 1-2 hypotheses are obvious (e.g., stack trace points directly to the bug).
Use the Agent tool to launch all agents below in a single assistant message so they run concurrently. Each Agent call uses model: "opus" and does not set run_in_background. The total is one Agent per hypothesis plus one Codex Agent; state the count explicitly when emitting the calls.
- Hypothesis Agent (one per hypothesis): Each receives the hypothesis, relevant file paths, what evidence to look for, and instructions to report confirmed / refuted / inconclusive with evidence. Budget: max 5 tool calls per subagent.
- Codex Agent: Launch one Agent whose prompt instructs the subagent to invoke /consult-codex via the Skill tool with a focused prompt describing the problem, reproduction, and files examined. The multi-turn conversation allows it to dig deeper into patterns the hypothesis-driven subagents miss. Run the /evaluate-findings skill on its output after the Agent returns.
After all investigators complete, merge the results. Codex findings that overlap with a subagent's confirmed hypothesis reinforce confidence; novel Codex findings become additional hypotheses to test in Step 4.
Step 4: Test
Verify each hypothesis with minimal, targeted actions:
| Action Type | Tool |
|---|---|
| Find usage or pattern | Grep |
| Read surrounding code | Read |
| Check recent changes | Bash (git log, git blame, git diff) |
| Run isolated test | Bash (specific test command) |
| Check dependency version | Bash (npm ls, pip3 show, etc.) |
| Inspect runtime state | Bash (add temporary logging, run, check output) |
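A few plain-shell sketches of these verification actions (package names, file paths, and test names are hypothetical):

```bash
# Check which version of a dependency is actually resolved.
npm ls lodash                    # Node: installed version in the dependency tree
pip3 show requests               # Python: version and install location

# Run only the failing test in isolation.
npx jest src/checkout.test.ts -t "applies discount once"

# Check recent changes around the suspect code.
git log --oneline -5 -- src/checkout.ts
git diff HEAD~5 -- src/checkout.ts
```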
Record each result:
| Hypothesis | Verdict | Evidence |
|---|---|---|
| H1 | confirmed / refuted / inconclusive | [what was found] |
| H2 | confirmed / refuted / inconclusive | [what was found] |
Iteration
If all hypotheses are refuted or inconclusive:
- Document what was learned — each refuted hypothesis eliminates a possibility and narrows the search
- Return to Step 2 with the new information to re-isolate
- Generate new hypotheses in Step 3 based on updated understanding
Cycle budget: maximum 2 full cycles (hypothesize → test → learn → repeat) before escalating.
Escalation
After 2 failed hypothesis cycles, offer escalation to /consult-oracle via AskUserQuestion:
Investigation stalled after [N] hypothesis cycles.
Tested: [summary of hypotheses and evidence]
Remaining unknowns: [what is still unclear]
Escalate to Oracle? (consults external model with full context)
Proceed only if the user approves.
Investigation Report
Output results as text:
Investigation Report:
Problem: [one-line description]
Type: [runtime error | test failure | build failure | type error | performance | unexpected behavior]
Root cause: [confirmed cause, or "unresolved" with best hypothesis]
Evidence:
- [what confirmed the root cause]
Suggested fix: [description of what to change, or "needs further investigation"]
Reproduction command: [command to verify the fix once applied]
Hypotheses tested:
1. [hypothesis] — [confirmed/refuted/inconclusive] — [evidence]
2. [hypothesis] — [confirmed/refuted/inconclusive] — [evidence]
Escalation: [none | oracle]
Then use the TaskList tool and proceed to any remaining task.
Rules
- If the problem turns out to be environmental (wrong Node version, missing dependency, OS-specific), report that clearly — it may not require a code fix.
- If the problem is in a dependency (not the project's code), document the dependency issue and suggest workaround options rather than patching the dependency.