competing-hypotheses
Competing Hypotheses
Debug problems by racing multiple theories in parallel. Each investigator pursues a different hypothesis, gathers evidence, and reports back. The lead compares findings to identify the root cause.
When to Use
- "I have no idea why this is broken"
- A bug that could have multiple root causes
- Unexpected behaviour with no obvious source
- Performance regressions with unclear origin
- Intermittent failures that are hard to reproduce
Instructions for Claude
You are the lead investigator coordinating a parallel hypothesis investigation.
Coordination Protocol
Messages between teammates are asynchronous — a message sent now may not be read until the recipient finishes their current work. You cannot rely on message timing for coordination. Instead, task status is the shared state that tells every agent where things stand.
Task Status as Position Marker
When a teammate receives a message, they determine where it sits in the conversation by checking their task status — not by assuming it arrived "just now."
| Status | Who sets it | Meaning |
|---|---|---|
| `pending` | Lead | Not started, waiting for assignment |
| `in_progress` | Teammate | Working, or finished and parked waiting for lead to acknowledge |
| `completed` | Lead only | Lead has read the teammate's report; this IS the acknowledgment |
The lead marks tasks completed — not the teammate. When a teammate sees their task marked completed, they know the lead has processed their report and any new message is current.
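The ownership rules in the table can be read as a small permission map. The Python sketch below only models those rules for illustration; it is not the task tool's API, and the names `Status`, `Role`, `ALLOWED_SETTERS`, and `can_set` are hypothetical.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"


class Role(Enum):
    LEAD = "lead"
    TEAMMATE = "teammate"


# Who may set each status, copied from the table above.
ALLOWED_SETTERS = {
    Status.PENDING: {Role.LEAD},
    Status.IN_PROGRESS: {Role.TEAMMATE},
    Status.COMPLETED: {Role.LEAD},
}


def can_set(role: Role, new_status: Status) -> bool:
    """Return True if `role` is allowed to move a task to `new_status`."""
    return role in ALLOWED_SETTERS[new_status]
```

Under this model, `can_set(Role.TEAMMATE, Status.COMPLETED)` is `False`, which is the point of the protocol: a teammate can never acknowledge its own report.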
Teammate Protocol
Include these rules in every teammate's spawn prompt:
- Mark your task `in_progress` when you begin work
- Read your task with `TaskGet`. The task description contains everything you need (fix details, implementation instructions, etc.). Do NOT search the filesystem or other agents' files for this content.
- If your task description is missing required content (e.g., an implementation task with no fix details), tell the lead immediately and park. Do not improvise.
- When done, send your report via `SendMessage`, then park: stop all work, do not check `TaskList` or claim new tasks. Just wait.
- Before acting on any received message, check your task status via `TaskGet`:
  - Still `in_progress` → lead hasn't acknowledged your report yet. This message may pre-date your report. Reply with your current state instead of re-executing.
  - `completed` → lead has processed your report. If a new task is assigned to you, this message contains current instructions, so proceed.
- Wait for all spawned subagents to finish before sending your report. Do not leave background work running.
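The status check a parked teammate runs before acting on a message can be sketched as a small decision function. This is an illustrative model only: `handle_message` and its return labels are hypothetical, the status string is assumed to have been read from the teammate's own task, and the final fallback branch is an assumption because the protocol above does not describe that case.

```python
def handle_message(task_status: str, has_new_assigned_task: bool) -> str:
    """Decide what a parked teammate should do with an incoming message.

    `task_status` is the status read from the teammate's own task.
    `has_new_assigned_task` is True if the lead has assigned a new task.
    """
    if task_status == "in_progress":
        # The lead has not acknowledged the report yet, so this message
        # may pre-date the report. Do not re-execute anything.
        return "reply_with_current_state"
    if task_status == "completed" and has_new_assigned_task:
        # The lead processed the report and assigned new work, so the
        # message carries current instructions.
        return "proceed"
    # Assumption: in any other situation, stay parked and keep waiting.
    return "keep_waiting"
```

The function deliberately ignores when the message arrived; only the task status decides the outcome.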
Lead Protocol
- After reading a teammate's report, mark their task `completed` (your acknowledgment)
- Before sending new instructions, ensure the previous task is `completed` and the new task is created/assigned
- Verify phase completion via `TaskList`: check that all relevant tasks show the expected status, don't track messages mentally
- Between implementation steps, run `git status` to confirm a clean working tree before proceeding
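Verifying a phase from task state rather than from memory might look like the sketch below. It assumes the task list has already been fetched and reduced to plain dictionaries with `id` and `status` keys; that shape, and the helper names `phase_complete` and `awaiting_acknowledgment`, are hypothetical and not the `TaskList` output format.

```python
def phase_complete(tasks: list[dict], expected: str = "completed") -> bool:
    """True only if there is at least one task and every task has `expected` status."""
    return bool(tasks) and all(task["status"] == expected for task in tasks)


def awaiting_acknowledgment(tasks: list[dict]) -> list[str]:
    """IDs of tasks still in_progress: either being worked on, or parked
    and waiting for the lead to read the report."""
    return [task["id"] for task in tasks if task["status"] == "in_progress"]


# Example snapshot of a task list (illustrative data).
tasks = [
    {"id": "1", "status": "completed"},
    {"id": "2", "status": "in_progress"},
]
```

With this snapshot, `phase_complete(tasks)` is `False` and `awaiting_acknowledgment(tasks)` returns `["2"]`, so the lead knows exactly which report is still outstanding.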
Phase 1: Hypothesize
- Understand the problem from the user's input:
  - What's the symptom? (error message, wrong output, unexpected behaviour)
  - When does it happen? (always, sometimes, after a recent change)
  - What's already been tried?
- Generate 2-5 plausible hypotheses for the root cause
  - Each should be distinct and testable
  - Cover different areas (data, logic, infrastructure, external dependencies, timing)
- Present the hypotheses to the user:
  - List each hypothesis with a brief rationale
  - Ask: "I'll spin up N investigators to pursue these in parallel. Proceed?"
  - Incorporate any hypotheses the user wants to add or remove
Phase 2: Parallel Investigation
- Create a team with `TeamCreate`
- Create tasks for each hypothesis with `TaskCreate`
- Spawn one `general-purpose` teammate per hypothesis using `Task` with `team_name`
  - Name them after their hypothesis (e.g., `race-condition-investigator`, `data-corruption-investigator`)
  - Each investigator's prompt should include:
    - The overall problem description
    - Their specific hypothesis to pursue
    - Instruction to investigate only, do not make changes
    - The Teammate Protocol from the Coordination Protocol above (copy it into their prompt verbatim)
    - What evidence to look for (see Investigation Guide below)
    - Instruction to report findings via `SendMessage`
- Spawn all investigators in parallel
- As investigators report back, mark each investigation task `completed` (acknowledging the report) and give the user brief progress updates
- If an investigator discovers a recent commit already resolved the issue, report the finding to the user and end early if they confirm it's fixed
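Assembling an investigator's name and spawn prompt from the pieces listed above could be sketched as follows. The helpers `investigator_name` and `build_investigator_prompt`, the section titles, and the prompt layout are all hypothetical; the only fixed requirement from this document is which pieces the prompt must contain.

```python
import re


def investigator_name(hypothesis: str) -> str:
    """Turn a short hypothesis label into a teammate name such as
    'race-condition-investigator'."""
    slug = re.sub(r"[^a-z0-9]+", "-", hypothesis.lower()).strip("-")
    return f"{slug}-investigator"


def build_investigator_prompt(
    problem: str,
    hypothesis: str,
    teammate_protocol: str,
    evidence_guide: str,
) -> str:
    """Join the required prompt sections in a fixed order."""
    sections = [
        ("Problem", problem),
        ("Your hypothesis", hypothesis),
        ("Constraints", "Investigate only. Do not make changes."),
        # The protocol text is passed in and included verbatim.
        ("Teammate Protocol", teammate_protocol),
        ("Evidence to look for", evidence_guide),
        ("Reporting", "Report your findings via SendMessage."),
    ]
    return "\n\n".join(f"{title}:\n{body.strip()}" for title, body in sections)
```

For example, `investigator_name("Race condition")` gives `"race-condition-investigator"`. Passing the protocol in as a single string keeps it verbatim rather than paraphrased per teammate.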
Subagent Guidance for Investigators
Include the following in each investigator's prompt:
Use subagents (`Task` tool) to keep your context focused. Spawn subagents for:
- Exploring specific files, modules, or subsystems
- Searching through git history, logs, or large codebases
- Any research tangent that might not pan out
Each subagent should report back:
- Relevant findings — what it discovered that matters to your investigation
- Red herrings (1-2 sentences) — anything that looks related but isn't, and why. Calling these out early prevents wasted cycles re-exploring dead ends.
Report red herrings even when your main findings are conclusive — they prevent other agents from re-exploring the same dead ends.
After receiving a subagent's report, decide whether to:
- Use its findings directly — if the summary gives you enough to proceed
- Dive in yourself — if the subagent found something promising and you want full, first-hand context in that area before drawing conclusions. Examples: conflicting evidence that needs direct examination, low confidence in the subagent's assessment, or complex state/flow where first-hand context matters.
When choosing subagent types, prefer read-only or exploration-focused types for open-ended codebase searches, and full-capability types for targeted analysis or tasks that need write access.
Investigation Guide
Each investigator should:
- Search for evidence supporting their hypothesis
  - Read relevant code paths
  - Check logs, error messages, stack traces if available
  - Look at recent changes (git log, git diff) that could be related
  - Examine configuration, environment, data
- Search for counter-evidence that would disprove their hypothesis
- Rate their confidence based on what they found
- Report using the output format below
Investigator Output Format
## Hypothesis: {description}
### Evidence For
- {evidence point}: {where found, what it means}
### Evidence Against
- {evidence point}: {where found, what it means}
### Red Herrings
- {code paths or areas explored that looked related but weren't, and why}
### Confidence: {high/medium/low}
### Root Cause (if found)
{specific root cause, file, line, mechanism}
### Suggested Fix
{what to change and why}
### Open Questions
- {anything unresolved that could help narrow it down}
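A lead that wants to check a report against this format before comparing findings could use something like the sketch below. `REQUIRED_HEADINGS` and `missing_sections` are hypothetical names, and treating only "Root Cause (if found)" as optional is an assumption based on its label in the template.

```python
# Headings every report is expected to contain. "Root Cause" is left out
# because the template marks it "(if found)" (assumption: all others are required).
REQUIRED_HEADINGS = [
    "## Hypothesis:",
    "### Evidence For",
    "### Evidence Against",
    "### Red Herrings",
    "### Confidence:",
    "### Suggested Fix",
    "### Open Questions",
]


def missing_sections(report: str) -> list[str]:
    """Return the required headings that do not start any line of the report."""
    lines = [line.strip() for line in report.splitlines()]
    return [
        heading
        for heading in REQUIRED_HEADINGS
        if not any(line.startswith(heading) for line in lines)
    ]
```

An empty result means the report is structurally complete; a non-empty one tells the lead which section to ask the investigator about.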
Phase 3: Compare & Conclude
- Once all investigation tasks show `completed` in `TaskList`, compare findings:
  - Which hypothesis has the strongest evidence?
  - Did any investigator find definitive proof?
  - Do findings from different investigators corroborate each other?
  - Are there open questions that could be quickly resolved?
- Compound bugs — if multiple hypotheses are confirmed, present as a multi-root-cause scenario and propose fixing in dependency order (fix the cause that enables the others first)
- Present the analysis to the user:
  - Rank hypotheses by evidence strength
  - Highlight the most likely root cause
  - Note any surprising findings or ruled-out theories
  - Recommend next steps (fix, further investigation, or targeted test)
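One crude way to turn "rank by evidence strength" into something mechanical is sketched below. It is a heuristic under stated assumptions, not a prescribed scoring rule: counting evidence points ignores how strong each point is, and the `Finding` class, `CONFIDENCE_RANK`, and `rank_findings` are hypothetical names. Net evidence is the primary key and the investigator's confidence only breaks ties, so a hypothesis with heavy counter-evidence drops even if it was rated likely.

```python
from dataclasses import dataclass, field

CONFIDENCE_RANK = {"high": 2, "medium": 1, "low": 0}


@dataclass
class Finding:
    hypothesis: str
    confidence: str  # "high", "medium", or "low"
    evidence_for: list[str] = field(default_factory=list)
    evidence_against: list[str] = field(default_factory=list)


def rank_findings(findings: list[Finding]) -> list[Finding]:
    """Sort findings strongest first: net evidence count, then confidence."""

    def score(finding: Finding) -> tuple[int, int]:
        net = len(finding.evidence_for) - len(finding.evidence_against)
        return (net, CONFIDENCE_RANK[finding.confidence])

    return sorted(findings, key=score, reverse=True)
```

A ranking like this is only a starting point for the lead's comparison; definitive proof from a single investigator should still outweigh any count.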
Phase 4: Fix (Optional)
Skip this phase if the user only wanted diagnosis, not a fix.
1. If the root cause is clear and the user wants to proceed, follow the Lead Protocol:
   a. Create an implementation task. Include in the task description: the fix details (root cause, what to change, which files, expected outcome) and the subagent guidance for implementation
   b. Assign the task to the investigator who found the root cause and send them a message saying their implementation task is ready; the task description contains everything they need
   c. Wait: the investigator will implement, send a report, and park
   d. Read the report. Mark the implementation task `completed` (your acknowledgment).
   e. Run `git status` to confirm a clean working tree
2. If the root cause is unclear:
   - Propose targeted experiments to disambiguate
   - Ask the user which direction to pursue
3. For compound bugs (multiple root causes), implement fixes one at a time: repeat step 1 for each, verifying clean git state between each fix
4. After all fixes, verify via `TaskList` that all implementation tasks are `completed` and `git status` shows a clean working tree. Then spawn a fresh `validator` teammate. The validator's spawn prompt must include: the Teammate Protocol (verbatim), the original symptom, the confirmed hypothesis/root cause, and what the fix was intended to do.
5. If validation fails, route the failure back to the investigator who implemented the fix for corrections, then re-validate
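For compound bugs, "dependency order" (fix the cause that enables the others first) amounts to a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the `fix_order` helper and the root-cause names in the example are hypothetical.

```python
from graphlib import TopologicalSorter


def fix_order(enabled_by: dict[str, set[str]]) -> list[str]:
    """Return root causes in the order they should be fixed.

    `enabled_by` maps each root cause to the set of causes that enable it,
    so enabling causes come first in the result. Raises graphlib.CycleError
    if the causes depend on each other in a cycle.
    """
    return list(TopologicalSorter(enabled_by).static_order())


# Illustrative example: a missing lock enables a stale cache,
# which in turn enables a retry storm.
causes = {
    "missing-lock": set(),
    "stale-cache": {"missing-lock"},
    "retry-storm": {"stale-cache"},
}
order = fix_order(causes)  # ["missing-lock", "stale-cache", "retry-storm"]
```

Each entry in the resulting order then goes through its own implementation task, with a clean `git status` confirmed before starting the next one.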
Rules
- Task status is the source of truth: coordinate through `TaskUpdate` status, not message timing. Always check `TaskList` to verify state.
- Teammates park after reporting: after sending a report, stop and wait. Do not self-assign new work or act on queued messages without checking task status first.
- Lead owns `completed`: only the lead marks tasks `completed`. This is the acknowledgment that closes the loop.
- Keep investigators alive until the conclusion, since they may need follow-up questions
- 2-5 hypotheses max — too many dilutes focus
- Investigators don't communicate — they work independently to avoid confirmation bias
- Evidence over intuition — rank hypotheses by concrete evidence, not plausibility
- Counter-evidence matters — a hypothesis with strong counter-evidence should be deprioritized even if it seems likely
- Finish subagents before reporting — wait for all spawned subagents to complete before sending your report
- Tasks carry the content: implementation tasks must include the full fix details in the task description. Teammates should `TaskGet` their assigned task to find everything they need. Do NOT search the filesystem for instructions.
- Missing content? Park and ask. If a teammate receives a task but the description doesn't contain the details they need, they should immediately tell the lead and stop. Do not improvise by searching elsewhere.
- Shut down when done — after validation passes, or after the user declines to fix, send shutdown requests and wait for confirmations before reporting final results
- Unresponsive teammate? — if a teammate hasn't reported within a reasonable timeframe, check their task status. If stuck, spawn a replacement and inform the user.