adaptive-subagents

SKILL.md

Adaptive Subagent Routing

FIRST: Decompose Before Acting

Do NOT start working on the task directly. Before reading files or making edits, decompose the task into subtasks and delegate each one to the cheapest sufficient model via the Agent tool. The only exception is if the entire task is under ~100 tokens of output (a one-liner fix).

Quick Decision (apply to EVERY delegation)

Default: Haiku. Escalate only with reason.

  1. Is this a mechanical lookup? (find files, grep for a string, list structure, check if something exists) → Haiku
  2. Does it require comprehension or writing? (implement, refactor, test, debug, fix, analyze code, explain logic, summarize complex systems) → Sonnet
  3. Is it architectural, security-sensitive, or ambiguous? → Opus

Haiku boundary: Haiku can find things and format things. It cannot understand or reason about code. If the task requires reading code and making a judgment, that's Sonnet.

If unsure between two tiers, pick the cheaper one. Escalate after failure, not before.

Before each delegation, output **Routing to {Model}:** {brief reason} on its own line. When handling work inline, output **Staying on {Model}:** {brief reason} instead.

When to Delegate vs Handle Inline

Always delegate to a cheaper tier — the 3x–15x cost savings always outweigh subagent overhead.

Skip delegation only when:

  • The output is under ~100 tokens (one-liner answers, quick fixes already in context)
  • The overhead of spawning a subagent exceeds the work itself
  • The subtask needs the same model tier you're already on and context is already available

Routing Rules

Haiku (1x) Sonnet (3x) Opus (15x)
File search, grep, glob Implementation, bug fix Architecture, migration
Format/lint existing text Refactor (< 10 files) Security review
Check if file/string exists Test writing Ambiguous/conflicting scope
List files, directory structure Debugging with stack traces Multi-system design
Literal string replacements Code generation (< 500 lines) Performance analysis
Summarize/analyze code logic
Rename across multiple files
Code review, explain code

Escalate +1 tier: ambiguous requirements, public-facing output, security-sensitive, production-critical, or 2 consecutive failures. Downgrade: after planning completes, or remaining work is formatting/summarization.

Cost Ratios

Model Relative Cost Pricing
Haiku (1x) Baseline ~$1/M input, $5/M output
Sonnet (3x) 3x Haiku ~$3/M input, $15/M output
Opus (15x) 15x Haiku ~$15/M input, $75/M output

A 10-delegation task routed to Haiku instead of Opus saves ~93% on those calls.

Per-Subtask Routing

When a request involves multiple steps:

  1. Decompose first. Break the request into a todo list of subtasks.
  2. Route each subtask independently. Most items are cheaper than the overall task.
  3. Downgrade after planning. If Opus produces a plan, implementation goes to Sonnet. Lookups and cleanup go to Haiku.

Example — "design and implement a caching layer":

  • Plan the architecture → Opus
  • Search for existing cache usage → Haiku
  • Implement the cache module → Sonnet
  • Write tests → Sonnet
  • Update internal dev notes → Haiku
  • Write public API docs → Sonnet (escalated: public-facing)

Delegation

Choose agent type by task and model by complexity independently.

Agent Type Use When Tools Available
Explore Search, grep, read-only exploration Read, Glob, Grep (no Edit/Write)
general-purpose Implementation, edits, multi-step work All tools
Plan Architecture design, implementation planning Read, Glob, Grep (no Edit/Write)

Combine freely — Explore + haiku for file lookups, general-purpose + sonnet for implementation, Explore + sonnet for complex investigation, Plan + sonnet for straightforward design.

Agent(subagent_type: "Explore", model: "haiku", description: "...", prompt: "...")
Agent(subagent_type: "general-purpose", model: "sonnet", description: "...", prompt: "...")
Agent(subagent_type: "Explore", model: "sonnet", description: "...", prompt: "...")
Agent(subagent_type: "Plan", model: "sonnet", description: "...", prompt: "...")

Keep prompts scoped: relevant files, constraints, expected output format. The description parameter is required — use a short 3-5 word summary.

Expected Output Formats

Tier Expected Output
Haiku File paths, grep results, existence checks, formatted text
Sonnet Code diffs, implementation files, test files, error analysis
Opus Numbered plan steps, architecture decisions with rationale, risk assessment

Parallel Fan-Out

When a request contains independent subtasks, launch multiple subagents in a single message.

Before splitting, build a dependency graph. Does any subtask need another's output, or do they modify the same file? If yes, run them sequentially.

Pattern Parallel? Why
Separate files, separate concerns Yes No shared state
Research + unrelated implementation Yes Read-only doesn't conflict with writes
Implementation + its tests No Tests depend on the implementation
Feature + docs describing that feature No Docs need to reflect what was built
Two features touching different modules Yes Independent code paths
Refactor + anything in same files No Refactor changes the baseline

After all phases return, review outputs for conflicts before applying.

User Overrides

If the user says any of the following, respect it for the current request:

  • "no routing" / "don't delegate" — handle everything inline on the current model.
  • "use Opus" / "use Sonnet" / "use Haiku" — force that tier for all delegations.

Log the override with **Override:** {what the user requested} and resume normal routing on the next request.

Routing Log

After each routing decision, append a line to routing.log in the project root using the Bash tool:

echo "{model} | {agent_type or inline} | {estimated savings vs Opus, e.g. ~93%} | {brief task description}" >> routing.log

Log every **Routing to** and **Staying on** decision.

Guardrails

  • Max 2 retries per subagent, then escalate or ask the user.
  • Max 5 sequential delegations per user request. No limit on parallel.
  • Always review subagent outputs before applying — check for conflicts, hallucinated paths, and incomplete work.
  • Do NOT modify project files: Never write routing rules to CLAUDE.md. Never add hooks to settings.json. This skill is loaded automatically by the plugin.
Weekly Installs
16
First Seen
Feb 27, 2026
Installed on
gemini-cli16
github-copilot16
codex16
amp16
cline16
kimi-cli16