Multi-Agent Orchestration

Overview

Sequential execution of parallelizable work is wasted time. But naive parallelization creates merge conflicts, duplicated effort, and incoherent results.

Core principle: Decompose into independent bounded tasks, execute in parallel with isolated contexts, verify before merging. The orchestrator thinks; the workers execute; the judge validates.

Violating task boundaries is violating the entire pattern.

The Iron Law

EACH AGENT GETS ONE BOUNDED TASK WITH CLEAR ACCEPTANCE CRITERIA

If you cannot write a two-sentence task description with a concrete "done when" condition, the task is not ready for delegation.

When to Use

Use when work decomposes into independent units:

Writing tests for multiple modules simultaneously
Implementing features that touch separate files/modules
Reviewing multiple PRs or components in parallel
Migrating multiple services/packages to a new pattern
Generating documentation for separate subsystems
Refactoring parallel, non-overlapping code paths

Use this ESPECIALLY when:

You have 3+ tasks that don't depend on each other
Each task takes more than a few minutes of focused work
The work is well-understood enough to specify clearly upfront
You need to maximize throughput under time pressure

Do NOT use when:

Tasks have sequential dependencies (B needs output of A)
Work requires continuous shared state or conversation
The problem is not yet understood (use systematic-debugging first)
There's only one task (overhead of orchestration exceeds benefit)
Tasks heavily overlap in the files they modify

The Three Roles

Planner (You, the Orchestrator)

The Planner never writes code. The Planner decomposes, delegates, and coordinates.

Responsibilities:

Analyze the work and identify independent units
Define task boundaries and acceptance criteria
Assign each task to a Worker agent
Collect results and resolve conflicts
Dispatch Judge agent for quality verification

Worker (Subagent via Task Tool)

Each Worker receives ONE bounded task and works in isolation.

Characteristics:

Operates on a defined scope (specific files, specific module, specific test suite)
Has no knowledge of other Workers or their tasks
Produces a concrete deliverable (code, review, documentation)
Reports success/failure with evidence

Judge (Subagent via Task Tool)

The Judge reviews Worker output against acceptance criteria.

Characteristics:

Receives the original task description + Worker output
Evaluates against acceptance criteria only
Returns PASS/FAIL with specific reasons
Never fixes problems — sends back to a new Worker if FAIL

The Five Phases

Phase 1: Task Decomposition

BEFORE spawning any agents:

Map the Work
- List all units of work
- Identify file-level boundaries
- Mark dependencies between units
Find Independence
- Group work into non-overlapping scopes
- A task is independent when: changing its output cannot break another task's output
- If two tasks touch the same file → they are NOT independent (merge or sequence them)

Write Task Specifications

Each task spec MUST include:

TASK: [One-sentence description]
SCOPE: [Exact files/modules/functions this task may touch]
INPUT: [What the agent needs to know — context, requirements, examples]
ACCEPTANCE CRITERIA: [Concrete "done when" conditions]
CONSTRAINTS: [What the agent must NOT do — files to avoid, patterns to follow]

Verify Decomposition
- Do any two tasks share files in their SCOPE? → Merge or sequence them
- Can each task be completed without knowledge of the others? → If no, add missing INPUT
- Are acceptance criteria testable? → If no, rewrite them

Phase 2: Workspace Isolation

Each Worker needs an isolated context. Two strategies:

Strategy A: Git Worktrees (Preferred for Code Changes)

REQUIRED SUB-SKILL: Use using-git-worktrees to create one worktree per Worker.

# Orchestrator creates worktrees before spawning Workers
git worktree add .worktrees/task-auth -b orchestrate/task-auth
git worktree add .worktrees/task-api -b orchestrate/task-api
git worktree add .worktrees/task-tests -b orchestrate/task-tests

Why worktrees: Each Worker gets a full working copy. No merge conflicts during execution. Workers can build and test independently.

Strategy B: Scope Isolation (For Reviews, Analysis, Documentation)

When Workers produce output that doesn't modify the repository:

Each Worker reads from the same codebase
Each Worker writes to separate output locations
No worktrees needed — shared read access is safe

Phase 3: Worker Execution

Spawn Workers using the Task tool with precise prompts.

Each Task tool invocation MUST include:

The complete task specification from Phase 1
The working directory (worktree path if using Strategy A)
Relevant context the Worker cannot discover on its own
Explicit instruction to report completion status

Task prompt template:

You are a Worker agent. Complete this task and NOTHING ELSE.

TASK: Write unit tests for the authentication module.
SCOPE: Only modify files in src/auth/__tests__/
INPUT: The auth module exposes login(), logout(), refreshToken().
  Each function returns a Promise. See src/auth/index.ts for signatures.
ACCEPTANCE CRITERIA:
  - Tests cover all three exported functions
  - Tests cover success and error paths
  - All tests pass when run with `npm test -- --testPathPattern=auth`
  - No modifications outside src/auth/__tests__/
CONSTRAINTS:
  - Do NOT modify source code, only test files
  - Use existing test utilities from test/helpers.ts
  - Follow test naming patterns from src/users/__tests__/ as reference

Working directory: /project/.worktrees/task-auth-tests

When complete, report: PASS (all criteria met) or FAIL (which criteria unmet and why).

Parallel execution:

Spawn all Workers simultaneously using multiple Task tool calls
Do NOT wait for one Worker before spawning the next
The orchestrator waits for all Workers to complete

Phase 4: Result Merging

After all Workers complete:

Collect Results
- Gather each Worker's completion status and output
- Any FAIL? → Triage: re-specify and re-dispatch, or handle manually

Merge Strategy (for code changes)

# Start from main development branch
git checkout main

# Merge each Worker's branch
git merge orchestrate/task-auth --no-edit
git merge orchestrate/task-api --no-edit
git merge orchestrate/task-tests --no-edit

If merge conflict:

STOP. Conflicts mean task boundaries were wrong.
Resolve manually. Document what overlapped.
Tighten boundaries for next time.

Merge Strategy (for non-code output)
- Concatenate, deduplicate, or integrate outputs as appropriate
- Check for contradictions between Worker outputs

Phase 5: Quality Gate (Judge)

BEFORE declaring done, spawn a Judge agent.

You are a Judge agent. Review the following work against its acceptance criteria.

ORIGINAL TASK: [paste full specification]
WORKER OUTPUT: [paste result or point to changed files]

Evaluate EACH acceptance criterion:
- MET / NOT MET with specific evidence

Final verdict: PASS (all criteria met) or FAIL (list unmet criteria).

Do NOT fix anything. Only evaluate.

If Judge returns FAIL:

Create a new Worker task addressing ONLY the unmet criteria
Re-run Phase 3-5 for that specific remediation
Do NOT re-run the entire orchestration

Communication Patterns

Orchestrator to Worker

Always: Complete task spec, working directory, relevant context
Never: Vague instructions, references to "the other tasks", shared mutable state

Worker to Orchestrator

Always: Completion status (PASS/FAIL), evidence of completion, list of files changed
Never: Questions requiring orchestrator judgment mid-task (if Worker needs to ask, the task spec was incomplete)

Orchestrator to Judge

Always: Original task spec AND Worker output together
Never: Just the output without the criteria to judge against

Error Handling

Situation	Action
Worker reports FAIL	Re-read its output. Is the task spec wrong? Fix spec and re-dispatch. Is the code genuinely broken? Fix in a new focused Worker.
Worker modifies files outside its SCOPE	Discard that Worker's output entirely. The boundary violation means results are untrustworthy. Re-dispatch with stricter constraints.
Merge conflict between Worker branches	Resolve manually. This means your decomposition had overlapping scopes. Document and prevent next time.
Worker hangs or produces no output	Kill and re-dispatch. Likely the task spec was ambiguous or the scope was too large. Break it down further.
Judge returns FAIL on specific criteria	Spawn a NEW Worker to fix only the failing criteria. Do not re-run the entire task.
Multiple Workers fail on the same issue	STOP orchestration. Shared problem indicates missing context in Phase 1. Re-analyze before re-dispatching.

Practical Examples

Example 1: Parallel Test Writing

DECOMPOSITION:
  Task 1: Write unit tests for src/auth/ → worktree: task-auth-tests
  Task 2: Write unit tests for src/billing/ → worktree: task-billing-tests
  Task 3: Write unit tests for src/notifications/ → worktree: task-notif-tests

WHY PARALLELIZABLE: Each test suite touches only its own __tests__/ directory.
No shared files. No shared state.

MERGE: Fast-forward merges (no conflicts possible if scopes are respected).
JUDGE: Run full test suite. All tests pass. No regressions.

Example 2: Parallel Feature Implementation

DECOMPOSITION:
  Task 1: Add REST endpoint for /api/export (src/api/export.ts, src/api/export.test.ts)
  Task 2: Add CSV formatter (src/formatters/csv.ts, src/formatters/csv.test.ts)
  Task 3: Add email delivery integration (src/delivery/email.ts, src/delivery/email.test.ts)

WHY PARALLELIZABLE: Each feature is in a separate module with a defined interface.
Workers implement to the interface spec, not to each other's code.

MERGE: Sequential merges. Integration test after all merges.
JUDGE: Each module works in isolation. Integration test passes.

Example 3: Parallel Code Review

DECOMPOSITION:
  Task 1: Review PR #42 for correctness and edge cases
  Task 2: Review PR #43 for security implications
  Task 3: Review PR #44 for performance impact

WHY PARALLELIZABLE: Each review is independent read-only analysis.
No file modifications. No worktrees needed.

MERGE: Collect all review comments. Deduplicate if PRs overlap.
JUDGE: Not needed (reviews are already evaluative).

Red Flags - STOP and Re-Decompose

If you catch yourself thinking:

"This Worker needs to know what the other Worker is doing"
"I'll have them share this utility file"
"Let me update the shared config and tell both Workers"
"Worker 2 should wait for Worker 1's output"
"I'll just have one Worker do both since they're related"
"The merge conflicts aren't that bad, I'll resolve them"

ALL of these mean: Your decomposition is wrong. Return to Phase 1.

Common Rationalizations

Excuse	Reality
"Tasks are mostly independent, a little overlap is fine"	A little overlap guarantees merge conflicts. Eliminate overlap or sequence the dependent parts.
"I'll coordinate between Workers as they go"	Workers are isolated. Mid-task coordination means the task wasn't properly specified.
"One big task is easier than decomposing"	One big task is sequential. Decomposition is the entire point. If it can't decompose, don't use this pattern.
"The Judge step is overhead, I'll skip it"	Skipping quality gates means shipping unverified work. The Judge catches what each Worker misses about the whole.
"I'll parallelize everything for maximum speed"	Over-parallelization creates merge hell. Parallelize ONLY what is truly independent.
"Workers can figure out the details"	Vague specs produce vague results. Every minute spent on task specs saves ten minutes of rework.
"Just spawn more agents for more speed"	Diminishing returns. Communication overhead grows. 3-5 parallel Workers is usually the sweet spot.
"I'll fix the boundaries after seeing what conflicts"	Fix boundaries BEFORE execution. Post-hoc conflict resolution is the expensive path.

Quick Reference

Phase	Key Activities	Success Criteria
1. Decompose	Map work, find independence, write task specs	Each task has bounded scope, clear criteria, zero overlap
2. Isolate	Create worktrees or scope boundaries	Each Worker has isolated workspace
3. Execute	Spawn Workers in parallel via Task tool	All Workers complete with PASS status
4. Merge	Combine results, resolve any conflicts	Clean merges, integrated output
5. Judge	Quality gate review against criteria	All acceptance criteria verified MET

Integration with Other Skills

This skill requires using:

using-git-worktrees - REQUIRED when Workers make code changes (see Phase 2, Strategy A)

Complementary skills:

writing-plans - Use to create the implementation plan BEFORE decomposing into parallel tasks
executing-plans - Each Worker follows plan execution within its bounded scope
systematic-debugging - Use when a Worker fails and the cause is unclear
test-driven-development - Workers writing code should follow TDD within their scope
requesting-code-review - Judge agent role can be formalized as a code review request

Called by:

brainstorming - When the design phase produces work that decomposes into parallel tracks
writing-plans - When a plan identifies parallelizable implementation phases

Decision: Parallel vs Sequential

Can tasks run without knowing each other's output?
  YES → Parallel (use this skill)
  NO  → Do they form a strict chain (A→B→C)?
          YES → Sequential (don't use this skill)
          NO  → Partial: group independent tasks, sequence dependent groups

The overhead rule: If you have fewer than 3 independent tasks, or each task takes under 2 minutes, the orchestration overhead exceeds the parallelization benefit. Just do them sequentially.

multi-agent-orchestration