# iterative-fleet
A skill for reviewer-gated iterative loops of parallel `claude -p` or `codex exec` workers. Workers run per-iteration, a reviewer reads their output and writes a verdict, and a generated orchestrator decides whether to continue or stop — without ever killing or restarting workers. Supports both Claude and Codex providers, set per-fleet or per-worker. See the dag-fleet `SKILL.md` for full codex provider documentation (model aliases, `reasoning_effort`, limitations).
## When to use this skill
Reach for iterative-fleet when:
- The work needs multiple rounds of refinement (not one-shot)
- A reviewer/verifier must approve output before the work is done
- You want operator-declared stop conditions (max iterations, LGTM count, cost cap)
- Workers are long-running or have high bootstrap costs (no auto-restart — see CRITICAL section)
Use dag-fleet instead when:
- Workers run once and are done (no iteration needed)
- There is no reviewer quality gate
Use worktree-fleet instead when:
- Tasks are fully independent with no shared state
## CRITICAL: No auto-kill, no auto-restart
The orchestrator generated by this skill reads and decides only. It NEVER kills or restarts workers. Workers run to natural completion per iteration. This design comes from a $20 death spiral in experiment 001 where auto-restart on "stuck" workers caused cache rebuilds that cost more than the actual work. The reviewer is the quality gate. The operator is the kill switch.
## fleet.json schema
```json
{
  "fleet_name": "my-iterative-fleet",
  "type": "iterative",
  "config": {
    "max_concurrent": 3,
    "model": "sonnet",
    "fallback_model": "haiku",
    "max_iterations": 10,
    "cost_cap_usd": 10.0
  },
  "workers": [
    { "id": "builder-a", "type": "code-run", "task": "...", "max_budget_per_iter": 1.0 },
    { "id": "builder-b", "type": "code-run", "task": "...", "max_budget_per_iter": 1.0 },
    { "id": "reviewer", "type": "reviewer",
      "task": "Review output, write verdict: lgtm | iterate | escalate",
      "depends_on": ["builder-a", "builder-b"] }
  ],
  "stop_when": {
    "reviewer_lgtm_count": 3,
    "max_iterations": 10,
    "cost_cap_usd": 10.0
  }
}
```
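The `stop_when` thresholds are read at generation time and baked into the orchestrator. As a sketch of the resulting logic (illustrative only — `should_stop` and the inlined values below are assumptions, not the actual generated `orchestrator.sh`):

```shell
# Illustrative stop-condition check with the stop_when values above
# inlined, as the generator would do. NOT the real generated code.
should_stop() {
  local lgtm_count="$1" iteration="$2" cost_usd="$3"
  # reviewer_lgtm_count: stop once enough lgtm verdicts have accumulated
  [ "$lgtm_count" -ge 3 ] && return 0
  # max_iterations: hard ceiling on rounds
  [ "$iteration" -ge 10 ] && return 0
  # cost_cap_usd: compare floats via awk (shell arithmetic is integer-only)
  awk -v c="$cost_usd" 'BEGIN { exit (c >= 10.0) ? 0 : 1 }' && return 0
  return 1
}
```

Because the thresholds are inlined, changing them mid-run requires regenerating the orchestrator, which is why the rationalizations table below forbids editing them in place.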
## DAG ordering via `depends_on`
Each iteration is a DAG. Workers declare dependencies with `depends_on`:
```json
{ "id": "reviewer", "type": "reviewer", "depends_on": ["builder-a", "builder-b"] }
```
At launch, only layer-0 workers (no deps) are spawned. The orchestrator spawns subsequent layers after their dependencies complete, then repeats the DAG on the next iteration.
```
iteration 1: [builder-a, builder-b] → [reviewer] → verdict: iterate
iteration 2: [builder-a, builder-b] → [reviewer] → verdict: lgtm → stop
```
Multi-layer DAGs work too (e.g., researcher → builder → reviewer = 3 layers). The shared `lib/dag.sh` provides topo-sort and dependency checking, reusable across all fleet types.

If no worker has `depends_on`, all workers launch in parallel (backward compatible).
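The layering logic can be sketched as follows. This is a standalone illustration, not the actual `lib/dag.sh` implementation — `dag_layers` is a hypothetical helper, and `jq` is assumed to be installed:

```shell
# dag_layers FLEET_JSON — print one "layer N: id id..." line per DAG layer.
# Hypothetical sketch; lib/dag.sh's internals may differ.
dag_layers() {
  local fleet_json="$1" resolved="" layer=0 remaining current id deps d ready
  remaining=$(jq -r '.workers[].id' "$fleet_json")
  while [ -n "$remaining" ]; do
    current=""
    for id in $remaining; do
      deps=$(jq -r --arg id "$id" \
        '.workers[] | select(.id == $id) | (.depends_on // [])[]' "$fleet_json")
      ready=1
      for d in $deps; do
        # ready only if every dependency already sits in an earlier layer
        case " $resolved " in *" $d "*) ;; *) ready=0 ;; esac
      done
      [ "$ready" -eq 1 ] && current="$current $id"
    done
    # no progress means a dependency cycle in fleet.json
    [ -z "$current" ] && { echo "cycle detected" >&2; return 1; }
    echo "layer $layer:$current"
    resolved="$resolved$current"
    for id in $current; do
      remaining=$(printf '%s\n' "$remaining" | grep -vx "$id" || true)
    done
    layer=$((layer + 1))
  done
}
```

For the two-builder example above, this yields `layer 0` with both builders and `layer 1` with the reviewer.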
## Iteration directory structure
```
$FLEET_ROOT/
  fleet.json
  iterations/
    1/
      builder-a.log
      builder-b.log
      review.md          # reviewer writes verdict here
    2/
      ...
  workers/
    builder-a/
      prompt.md
      session.jsonl
    ...
  orchestrator.sh        # generated — reads logs, decides iterate/pause/stop
  .paused                # exists when paused (touch to pause, rm to resume)
```
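The `.paused` flag is plain filesystem state, so honoring it only takes a poll between iterations. A minimal sketch (assumption: the generated orchestrator checks only at iteration boundaries; `wait_if_paused` and the 10-second interval are illustrative, not the actual generated code):

```shell
# Block until the operator removes .paused (via resume.sh or plain rm).
# Checked only at iteration boundaries — never mid-iteration, and never
# used as a pretext to kill workers.
wait_if_paused() {
  local fleet_root="$1"
  while [ -e "$fleet_root/.paused" ]; do
    sleep 10   # illustrative poll interval
  done
}
```

A flag file keeps pause/resume race-free and inspectable: `ls $FLEET_ROOT/.paused` tells anyone whether the fleet is paused.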
## Reviewer interface
The reviewer worker reads `iterations/<N>/*.log` and writes `iterations/<N>/review.md`. The `review.md` MUST contain one of:

- `verdict: lgtm` — output is approved, counts toward the stop condition
- `verdict: iterate` — needs another round
- `verdict: escalate` — needs human attention, orchestrator pauses
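Because the verdict line has a fixed shape, extracting it on the orchestrator side can be as simple as a `grep`. A sketch (the `read_verdict` helper is hypothetical; the generated orchestrator may parse differently, but the fallback to `iterate` mirrors its documented default when no verdict is written):

```shell
# Extract the verdict from an iterations/<N>/review.md.
# A missing file or missing verdict line falls back to "iterate",
# matching the orchestrator's documented default behavior.
read_verdict() {
  local review_md="$1" verdict
  verdict=$(grep -m1 -E '^verdict: (lgtm|iterate|escalate)$' "$review_md" 2>/dev/null \
    | cut -d' ' -f2 || true)
  echo "${verdict:-iterate}"
}
```

Note the anchored pattern: "looks mostly good" or any free-text approval never matches, which enforces the "only `verdict: lgtm` counts" rule mechanically.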
### Reviewer prompt requirements
Every reviewer `prompt.md` MUST include these instructions verbatim (with paths adjusted). Without them, the reviewer won't know where to write the verdict and the orchestrator defaults to `iterate`, wasting an iteration.
```markdown
## Writing your verdict

1. Determine the current iteration number: list the `iterations/` directory and find the
   highest-numbered subdirectory that does NOT yet contain a `review.md`.
2. Write your verdict to `iterations/<N>/review.md` (relative to your working directory).
   **Never use absolute paths.**
3. The file MUST contain a line exactly like one of:
   - `verdict: lgtm`
   - `verdict: iterate`
   - `verdict: escalate`
4. Below the verdict line, list **actionable fix instructions** per worker — not just what's
   wrong, but exactly where and how to fix it (file path, function name, what to change).
   The builder sees this feedback on the next iteration, so vague issues waste a cycle.
```
Example `iterations/1/review.md`:
```markdown
verdict: iterate

## builder-a
- `src/parser.py:parse_input()` — no try/except around JSON decode. Wrap lines 45-48 in
  try/except json.JSONDecodeError, return None on failure.
- `src/parser.py:validate_schema()` — missing required field "timestamp" in schema dict at
  line 72. Add `"timestamp": {"type": "string", "required": True}`.

## builder-b
- `src/utils.py` — `format_output()` defined but not in `__all__` or exported in
  `__init__.py`. Add to `src/__init__.py` line 5: `from .utils import format_output`.
```
## Available scripts
| Script | Purpose |
|---|---|
| `launch.sh <fleet-root> [--dry-run]` | Parse fleet.json, generate orchestrator.sh, spawn workers + orchestrator in tmux |
| `status.sh <fleet-root>` | Show iteration count, reviewer verdict history, per-worker status, cost |
| `pause.sh <fleet-root>` | Touch `.paused` — orchestrator stops at next iteration boundary |
| `resume.sh <fleet-root>` | Remove `.paused` — orchestrator continues |
| `kill.sh <fleet-root> all [--force]` | Hard stop: kill tmux session, sweep orphans, unregister |
## Launch procedure
1. Create `$FLEET_ROOT/fleet.json` with workers including exactly one `type: "reviewer"` worker (with `depends_on` pointing to the builder workers).
2. Create `$FLEET_ROOT/workers/<id>/prompt.md` for each worker.
3. Run `bash ${CLAUDE_SKILL_DIR}/scripts/launch.sh $FLEET_ROOT`.
4. ALWAYS tell the user the exact status command so they can monitor manually: `bash ${CLAUDE_SKILL_DIR}/scripts/status.sh $FLEET_ROOT`. This is mandatory after every launch. The user must be able to check status without asking you.
5. Pause if needed: `bash ${CLAUDE_SKILL_DIR}/scripts/pause.sh $FLEET_ROOT`
6. Hard stop: `bash ${CLAUDE_SKILL_DIR}/scripts/kill.sh $FLEET_ROOT all`
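Steps 1-2 can be sanity-checked before launching. A hedged sketch that scaffolds a minimal fleet and verifies the "exactly one reviewer" invariant with `jq` (the layout follows this document; the prompt contents are placeholders, `jq` availability is assumed, and the check is illustrative — `launch.sh` may validate differently):

```shell
# Scaffold a minimal fleet layout and verify it contains exactly one
# type "reviewer" worker before handing it to launch.sh.
FLEET_ROOT=$(mktemp -d)
mkdir -p "$FLEET_ROOT/workers/builder-a" "$FLEET_ROOT/workers/reviewer"

cat > "$FLEET_ROOT/fleet.json" <<'EOF'
{
  "fleet_name": "demo",
  "type": "iterative",
  "workers": [
    { "id": "builder-a", "type": "code-run", "task": "..." },
    { "id": "reviewer", "type": "reviewer",
      "task": "Review output, write verdict: lgtm | iterate | escalate",
      "depends_on": ["builder-a"] }
  ],
  "stop_when": { "reviewer_lgtm_count": 1, "max_iterations": 5 }
}
EOF

# Placeholder prompts — real ones must follow the reviewer prompt requirements.
echo "Build the thing." > "$FLEET_ROOT/workers/builder-a/prompt.md"
echo "Review per the verdict protocol." > "$FLEET_ROOT/workers/reviewer/prompt.md"

# Invariant: exactly one reviewer worker.
reviewers=$(jq '[.workers[] | select(.type == "reviewer")] | length' "$FLEET_ROOT/fleet.json")
[ "$reviewers" -eq 1 ] || { echo "need exactly one reviewer" >&2; exit 1; }

# Then: bash ${CLAUDE_SKILL_DIR}/scripts/launch.sh "$FLEET_ROOT"
```

Catching a missing or duplicated reviewer here is cheaper than discovering it after the first iteration has already burned budget.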
## Rationalizations to reject
| Agent says | Rebuttal |
|---|---|
| "The worker has been running for 10 minutes with no output — it must be stuck, I should pause it" | Long thinking blocks look like silence. The orchestrator waits for result events, not timeouts. This is exactly the failure mode that caused the $20 death spiral in experiment 001. Do not intervene. |
| "The reviewer said 'looks mostly good' — I'll count that as LGTM" | Only `verdict: lgtm` counts. "Mostly good" is `iterate`. If the reviewer is ambiguous, the verdict is `iterate`. Do not interpret generously. |
| "I should kill this worker and restart it with a better prompt" | The orchestrator NEVER kills workers. Workers run to natural completion. If you need a different prompt, pause the fleet, edit `prompt.md`, and let the next iteration pick it up. |
| "The cost is getting high — I'll reduce max_iterations mid-run" | Stop conditions are baked into `orchestrator.sh` at generation time. To change them, kill the fleet, regenerate with a new `fleet.json`, and relaunch. Do not edit `orchestrator.sh` directly. |
| "I can skip the reviewer for this simple task" | If the task doesn't need a reviewer, use dag-fleet (one-shot) or worktree-fleet (independent). Iterative-fleet without a reviewer is a runaway loop. |
## Decision tree: which fleet?
1. Tasks independent (no shared files/state)? YES → worktree-fleet
2. Need iteration with reviewer quality gate? YES → iterative-fleet (this skill)
3. One-shot DAG with dependencies? YES → dag-fleet
4. None of the above? → open multiple Claude Code sessions
$ARGUMENTS