autonomous-execution
Autonomous Execution Methodology
You are running inside the Autopilot autonomous loop. Each iteration you get a FRESH context window.
Your durable state lives in .autopilot/ files, NOT in conversation memory.
The Single-Task Iteration Pattern
Every iteration follows this exact sequence:
1. ORIENT (Read State)
Read .autopilot/mission.json → What's the goal?
Read .autopilot/progress.json → What's done? What's next?
Read .autopilot/handoff.md → What did the last iteration learn?
Run: git log --oneline -5 → What was recently committed?
Pick the FIRST task with "status": "pending" (lowest ID).
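The selection rule above can be sketched in a few lines. This is a minimal illustration, not the actual Autopilot code: it assumes progress.json holds a "tasks" list whose entries carry a numeric "id" next to "status", and that layout is a guess rather than something this document specifies.

```python
from typing import Optional


def pick_next_task(progress: dict) -> Optional[dict]:
    """Return the pending task with the lowest ID, or None if nothing is pending.

    Assumes a hypothetical schema: progress["tasks"] is a list of dicts,
    each with an "id" and a "status" field.
    """
    pending = [t for t in progress.get("tasks", []) if t.get("status") == "pending"]
    return min(pending, key=lambda t: t["id"], default=None)


# Example state; tasks are deliberately out of order.
progress = {
    "iteration": 3,
    "tasks": [
        {"id": 1, "status": "completed"},
        {"id": 3, "status": "pending"},
        {"id": 2, "status": "pending"},
    ],
}
next_task = pick_next_task(progress)
```

Here next_task is the entry with id 2, because the choice depends on the ID and not on the position in the file.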
2. RESEARCH (If Needed)
- Spawn a researcher subagent for unfamiliar code/APIs
- Keep research bounded to what THIS task needs
- Skip if the task is straightforward
3. IMPLEMENT (One Task)
- Spawn an implementer subagent with the task + research context
- The implementer edits code, runs tests, self-reviews
- If the task needs multiple files, that's fine — but it's still ONE logical change
4. VERIFY (Fresh Eyes)
- For non-trivial changes, spawn a verifier subagent
- Verifier reads the git diff and scores the change
- If REJECT: fix issues (max 2 retries), then mark as error
- Skip verification for trivial changes (typos, comment updates)
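The retry rule in this step amounts to a bounded loop. A minimal sketch, assuming the verifier and the fixer can be modelled as plain callables (verify and fix are hypothetical stand-ins for the subagent calls):

```python
from typing import Callable

MAX_RETRIES = 2  # from the rule above: at most 2 retries after the first attempt


def verify_with_retries(verify: Callable[[], bool], fix: Callable[[], None]) -> str:
    """Return "completed" if verification passes, otherwise "error".

    verify() returns True on accept and False on REJECT.
    fix() is called after each REJECT while retries remain.
    """
    for attempt in range(MAX_RETRIES + 1):
        if verify():
            return "completed"
        if attempt < MAX_RETRIES:
            fix()
    return "error"
```

The returned string is meant to line up with the "status" values written during the UPDATE STATE step.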
5. COMMIT
- Stage only changed files (never git add .)
- Write a descriptive commit message
- Push if the project's CLAUDE.md says to
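One way to enforce explicit staging is to build the git command from a list of paths and reject blanket arguments. The sketch below is illustrative only; build_stage_command and commit are hypothetical helper names, and the set of rejected arguments is an assumption about what counts as blanket staging.

```python
import subprocess
from typing import List

# Arguments that would stage everything instead of named files (assumed list).
BLANKET_ARGS = {".", "*", "-A", "--all"}


def build_stage_command(paths: List[str]) -> List[str]:
    """Build a `git add` command that stages only the given files."""
    if not paths:
        raise ValueError("nothing to stage")
    if any(p in BLANKET_ARGS for p in paths):
        raise ValueError("refusing blanket staging; list files explicitly")
    # "--" separates options from paths, so odd file names are not read as flags.
    return ["git", "add", "--", *paths]


def commit(paths: List[str], message: str) -> None:
    """Stage the listed files and commit them with a descriptive message."""
    subprocess.run(build_stage_command(paths), check=True)
    subprocess.run(["git", "commit", "-m", message], check=True)
```

Keeping the command construction separate from the subprocess call makes the staging rule easy to check without touching a repository.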
6. UPDATE STATE
Update .autopilot/progress.json:
- Current task → "status": "completed" (or "error")
- Increment "iteration" counter
- Add commit hash to "commits" array
- Add lessons learned to "learnings" array
Update .autopilot/handoff.md with:
- What you just did
- What the next iteration should know
- Files you touched
- Any blockers or surprises
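The progress.json update can be sketched as a pure function over the parsed file. The "status", "iteration", "commits", and "learnings" keys come from the list above; the "tasks" list with "id" fields and the function name record_result are assumptions made for illustration.

```python
import copy
from typing import Iterable, Optional


def record_result(progress: dict, task_id: int, status: str,
                  commit_hash: Optional[str] = None,
                  learnings: Iterable[str] = ()) -> dict:
    """Return an updated copy of the progress state for one finished iteration."""
    if status not in ("completed", "error"):
        raise ValueError(f"unexpected status: {status}")
    updated = copy.deepcopy(progress)  # leave the caller's dict untouched
    for task in updated["tasks"]:  # "tasks" is an assumed key
        if task["id"] == task_id:
            task["status"] = status
            break
    else:
        raise KeyError(f"no task with id {task_id}")
    updated["iteration"] = updated.get("iteration", 0) + 1
    if commit_hash:
        updated.setdefault("commits", []).append(commit_hash)
    updated.setdefault("learnings", []).extend(learnings)
    return updated
```

The result would then be written back to .autopilot/progress.json, for example with json.dump, before the iteration signals completion.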
7. SIGNAL
Output: AUTOPILOT_STATUS: ITERATION_COMPLETE
Evolve Mode — Multi-Milestone Autonomous Loop
Evolve mode is a nested loop: outer = milestones, inner = GSD phases per milestone.
The Evolve Loop
while not done:
    if no pending milestones:
        run Strategist → generates/refreshes milestones.json
    pick next pending milestone
    run /gsd:new-milestone → sets up .planning/ structure
    run gsd_main_loop → executes all phases (discuss→plan→execute→verify)
    mark milestone complete
    run Strategist again → re-evaluates priorities (codebase changed!)
    repeat
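The loop above could look roughly like this in Python. It is a sketch under assumptions: run_strategist and run_milestone are hypothetical callables standing in for the Strategist run and the GSD setup plus main loop, and the stop conditions (a milestone budget, and an empty result from the Strategist) are invented here because the pseudocode leaves "done" undefined.

```python
from typing import Callable, List


def has_pending(milestones: List[dict]) -> bool:
    return any(m["status"] == "pending" for m in milestones)


def evolve(milestones: List[dict],
           run_strategist: Callable[[List[dict]], List[dict]],
           run_milestone: Callable[[dict], None],
           max_milestones: int) -> List[dict]:
    """Outer milestone loop: plan, execute one milestone, then re-plan."""
    finished = 0
    while finished < max_milestones:
        if not has_pending(milestones):
            milestones = run_strategist(milestones)  # generate or refresh
            if not has_pending(milestones):
                break  # assumed stop condition: nothing left to do
        current = next(m for m in milestones if m["status"] == "pending")
        current["status"] = "active"
        run_milestone(current)  # setup plus all phases for this milestone
        current["status"] = "completed"
        finished += 1
        milestones = run_strategist(milestones)  # codebase changed, re-evaluate
    return milestones
```

Passing the two steps in as callables keeps the control flow testable with stubs, independent of how the real agents are launched.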
Evolve State Files
| File | Purpose |
|---|---|
| .autopilot/milestones.json | Milestone list with status, gsdProject path, timestamps |
| .autopilot/handoff.md | Accumulates strategist analysis + per-iteration notes |
| .autopilot/iterations/strategist-N.log | Strategist agent output per run |
| .autopilot/iterations/milestone-setup-N.log | GSD new-milestone setup per milestone |
Milestone Lifecycle
pending → active → completed
                 ↘ skipped (if stuck)
The Strategist agent writes milestones.json. The loop reads it. The loop updates statuses. The Strategist re-reads it on re-evaluation to avoid duplicates.
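The lifecycle can be guarded with a small transition table so the loop never writes an invalid status. This sketch assumes that "skipped" is reached from "active" (a milestone gets stuck while being worked on) and that "completed" and "skipped" are terminal; neither point is stated explicitly here.

```python
# Allowed status transitions (assumed reading of the lifecycle diagram).
ALLOWED_TRANSITIONS = {
    "pending": {"active"},
    "active": {"completed", "skipped"},
    "completed": set(),
    "skipped": set(),
}


def transition(milestone: dict, new_status: str) -> dict:
    """Return a copy of the milestone with its status advanced, or raise."""
    current = milestone["status"]
    if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new_status}")
    return {**milestone, "status": new_status}
```

For example, moving a milestone from "pending" straight to "completed" raises an error, which surfaces a loop bug instead of silently corrupting milestones.json.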
Handoff in Evolve Mode
Because milestones span many iterations, handoff.md carries milestone-level context:
- Strategist analysis summary (health scores, key findings)
- Which milestone is active and what phase it's on
- Any cross-milestone learnings (e.g., "this codebase uses X pattern everywhere")
The Strategist Agent
The Strategist is a separate Claude instance that:
- Reads the codebase holistically (CLAUDE.md, README, git log, source structure)
- Scores codebase health across 9 dimensions (1-5)
- Generates concrete, achievable milestones ranked by impact
- Re-evaluates after each milestone (codebase changed — reprioritize)
- Writes results to .autopilot/milestones.json
Good milestones are concrete ("Add Zod validation to all API route inputs with error response formatting") not vague ("improve validation").
Why Re-Evaluate After Each Milestone?
When a milestone completes:
- Test coverage may now be sufficient → skip an "add tests" milestone
- A refactor may have fixed issues flagged for other milestones
- New technical debt may have been introduced
- The priority order almost certainly changed
Skipping re-evaluation means executing a stale plan. A Strategist run is cheap compared to the time spent executing a milestone.
Anti-Patterns to Avoid
- Multi-tasking: Doing 3 tasks in one iteration. You'll rush and make mistakes.
- Context hoarding: Reading 50 files "just in case." Read what you need.
- Skipping commits: If it's not committed, it doesn't exist for the next iteration.
- Forgetting handoff: The next iteration starts from zero — handoff.md is its lifeline.
- Over-engineering: The simplest change that satisfies the task is the best change.
- Infinite retries: 2 retries max, then mark as error and move on.
Subagent Strategy
| Need | Agent | Model |
|---|---|---|
| Understand code/APIs | researcher | haiku (quick) or sonnet (deep) |
| Decompose mission | planner | sonnet |
| Write/edit code | implementer | sonnet |
| Review changes | verifier | sonnet |
| Analyze codebase health | improver | sonnet |
Spawn subagents via the Task tool. They run in isolated contexts and return results.