Agent Teams Workflow

When to Use

Use this skill when coordinating multiple Claude Code agents to implement features in parallel using the Agent Teams feature. It covers:

Feature doc format and stack-aware lifecycle
Three-role separation: test-writer, builder, reviewer (order depends on stack)
File ownership rules to prevent conflicts
Hook-based quality gates (TaskCompleted, TeammateIdle, Stop)
Fast verification for rapid feedback, full verification for completion gates
Progress dashboard (feature-docs/STATUS.md) for zero-context recovery
Stuck detection and time blindness mitigation
Coordination protocol with kickoff prompts
Bootstrap and retrofit prompts for new and existing projects

Defer to other skills for:

git-workflow skill: Branch naming, commit message conventions, PR creation
testing-playwright skill: Frontend E2E test patterns (Playwright-specific)
testing-pytest skill: Python test patterns (pytest-specific)
testing-rust skill: Rust test patterns (cargo test-specific)

This workflow is adapted from Nicholas Carlini's Anthropic engineering post "Building a C compiler with a team of parallel Claudes" (Feb 2026). The key insight: the quality of the testing harness determines the quality of the output.

1. Settings Configuration

Add to .claude/settings.json:

{
  "$schema": "https://json.schemastore.org/claude-code-settings.json",
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  },
  "teammateMode": "tmux"
}

Setting	Values	What it does
`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS`	`"1"`	Enables the agent teams feature
`teammateMode`	`"auto"`, `"tmux"`, `"in-process"`	Controls how teammates are displayed

Display modes:

auto (default) — uses split panes if already in tmux, in-process otherwise
tmux — forces split-pane mode; each teammate gets its own tmux pane
in-process — all teammates share the main terminal; use Shift+Down to cycle between them

Override per-session: claude --teammate-mode in-process

2. Core Principles

Verification Oracle (Stack-Dependent)

The workflow uses different verification strategies depending on the stack:

Python/Rust — Tests as Oracle (TDD): The test-writer agent reads feature docs and writes failing tests. The builder agent implements code to make those tests pass. Nobody grades their own homework — the agent that writes tests never writes implementation, and the agent that implements never modifies tests.

Frontend — Interface as Oracle (Build-First): For frontend projects, the user-visible interface is the stable contract — not internal component APIs. The builder implements directly from the feature doc's acceptance criteria. The test-writer then writes Playwright E2E tests that verify the implementation matches the spec. Tests should PASS (not fail). Vibe-coded UIs change constantly — components get restructured, hooks get refactored, state management evolves. Unit tests against internal APIs break with every refactor. But the user-facing behavior (what they click, what they see) is stable. E2E tests verify that stable contract.

In both models, separation of concerns is preserved: the agent that builds never writes tests, and the agent that writes tests never modifies implementation.

Minimal Context Pollution

LLMs degrade as context fills with irrelevant information. Every hook and agent instruction is designed to produce minimal, structured output:

Test results print summary lines only, not full stack traces
Errors use a consistent format: ERROR [CATEGORY]: one-line description
Verbose output goes to agent_logs/, never to stdout
scripts/verify.sh logs full output to agent_logs/ and pipes through tail -10
Agent test commands use quiet reporters (-q, no --reporter=verbose)
Stop hook truncates output to 20 lines
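
As a rough illustration of these rules, a wrapper along the following lines could keep verbose output out of the agent's context. This is only a sketch: the run_quiet name, the AGENT_LOG_DIR variable, and the per-category log file naming are assumptions made for illustration, not the contents of the real scripts/verify.sh.

```shell
# Hypothetical helper: run_quiet CATEGORY command [args...]
# Full output goes to a log file; the caller sees one status line plus a short tail.
run_quiet() {
  local category="$1"; shift
  local log_dir="${AGENT_LOG_DIR:-agent_logs}"   # assumed override variable
  local log_file
  mkdir -p "$log_dir"
  log_file="$log_dir/$(printf '%s' "$category" | tr '[:upper:]' '[:lower:]').log"

  # Capture stdout and stderr in the log, never on the terminal.
  if "$@" >"$log_file" 2>&1; then
    echo "OK [$category]"
    return 0
  fi

  # On failure: one structured line, then the last 10 lines of the log.
  echo "ERROR [$category]: command failed, full log in $log_file"
  tail -10 "$log_file"
  return 1
}
```

For example, run_quiet TESTS pytest -q would print a single OK line on success, or the ERROR line plus at most ten log lines on failure.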

Fast Verification

The Stop hook runs scripts/fast-verify.sh (type check only) on every response where files changed. This catches type errors quickly without running the full suite. The full verify pipeline (scripts/verify.sh) runs only on TaskCompleted.

This mirrors Carlini's --fast mode: quick smoke checks during work, comprehensive validation only at completion gates.
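
A fast check of this kind might look roughly like the sketch below. The marker files used to detect the stack (tsconfig.json, Cargo.toml, pyproject.toml) and the function names are assumptions for illustration; the real scripts/fast-verify.sh may detect the stack and invoke the type checker differently.

```shell
# Hypothetical stack detection: the marker files are an assumption.
pick_typecheck_cmd() {
  local root="${1:-.}"
  if [ -f "$root/tsconfig.json" ]; then
    echo "npx tsc --noEmit"
  elif [ -f "$root/Cargo.toml" ]; then
    echo "cargo check --quiet"
  elif [ -f "$root/pyproject.toml" ]; then
    echo "mypy ."
  else
    return 1
  fi
}

# Type check only, and only when the working tree has changes.
fast_verify() {
  local root="${1:-.}" cmd out status
  if [ -z "$(git -C "$root" status --porcelain 2>/dev/null)" ]; then
    return 0                                   # nothing changed, nothing to check
  fi
  cmd="$(pick_typecheck_cmd "$root")" || return 0   # unknown stack: skip
  out="$(cd "$root" && $cmd 2>&1)"
  status=$?
  # Keep the agent's context small: at most 20 lines of checker output.
  [ -n "$out" ] && printf '%s\n' "$out" | head -20
  return "$status"
}
```

Capturing the output before truncating it keeps the type checker's own exit status intact, which a direct pipe into head would not.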

Time Blindness Mitigation

LLMs cannot self-regulate time. The TeammateIdle hook detects features stuck in building/ for over 30 minutes (using file modification time) and warns the user. This prevents agents from spinning indefinitely on hard problems.
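
The modification-time check described above can be sketched with find. The function names and the WARNING [STUCK] message format are hypothetical; only the building/ directory and the 30-minute threshold come from the workflow.

```shell
# Print every feature doc in building/ whose mtime is older than the threshold.
find_stuck_features() {
  local docs_root="${1:-feature-docs}" minutes="${2:-30}"
  [ -d "$docs_root/building" ] || return 0
  find "$docs_root/building" -maxdepth 1 -type f -name '*.md' -mmin "+$minutes"
}

# Emit one warning per stuck doc on stderr so stdout stays clean.
warn_stuck_features() {
  local docs_root="${1:-feature-docs}" minutes="${2:-30}" doc
  find_stuck_features "$docs_root" "$minutes" | while IFS= read -r doc; do
    echo "WARNING [STUCK]: $doc has not changed in over $minutes minutes" >&2
  done
}
```

Note that modification time only approximates "stuck": an agent that keeps rewriting the doc without making progress would not be flagged by this check.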

Progress Dashboard

Agents start each session with zero context. feature-docs/STATUS.md is updated by every agent after each stage transition. It shows what's in flight, what's blocked, and what's done — enabling any agent to orient quickly.

File Ownership

Feature docs declare which files each feature affects. No agent touches files owned by another in-progress feature. This prevents the problem Carlini identified: agents hitting the same bug, fixing it, and overwriting each other's changes.

Ownership is convention-based (declared in feature doc frontmatter), not technically enforced. Agents must check feature-docs/testing/ and feature-docs/building/ for overlapping affected-files before starting work.

CI as Regression Gate

The TaskCompleted hook runs the full verify pipeline (scripts/verify.sh) before any task can be marked done. An agent cannot ship code that breaks existing tests. This is enforced at the hook level (deterministic) rather than in prompts (probabilistic).

Human-in-the-Loop for Subjective Work

Tests verify functional correctness, but some decisions are subjective. For frontend projects, visual/style work requires human review loops with screenshots. The workflow splits into:

Feature work: Fully autonomous. Human writes spec, agents handle the pipeline (frontend: build → E2E test → review, Python/Rust: test → build → review).
Style work (frontend only): Human-in-the-loop. Agent makes changes, generates screenshots, pauses for human feedback. Approved screenshots become visual regression baselines.

3. Team Lifecycle

Step 1 — Create a Team

One team per feature or work unit. TeamCreate writes the team config to ~/.claude/teams/{team-name}/ and the task list to ~/.claude/tasks/{team-name}/.

TeamCreate { team_name: "feat-user-auth" }

Step 2 — Spawn Teammates

Use the Agent tool with team_name to add teammates. In tmux mode (or auto mode inside tmux), each spawned teammate appears in its own pane automatically; in in-process mode, teammates share the main terminal.

Agent {
  team_name: "feat-user-auth",
  name: "test-writer",
  subagent_type: "test-writer",
  prompt: "Pick up feature-docs/ready/003-user-auth.md",
  mode: "auto"
}

Parameter	Required	Purpose
`team_name`	Yes	Which team this teammate joins
`name`	Yes	Human-readable name for messaging and task assignment
`subagent_type`	Yes	Agent type — custom agents from `.claude/agents/` or built-in types
`prompt`	Yes	The task description / instructions
`mode`	No	Permission mode (`"auto"` for autonomous, `"plan"` for approval)

Step 3 — Coordinate with Tasks

Create structured work items that teammates can claim and track:

TaskCreate {
  subject: "Write failing tests for auth module",
  description: "Read feature doc acceptance criteria, write pytest tests..."
}

Assign and track:

TaskUpdate { taskId: "1", owner: "test-writer", status: "in_progress" }
TaskUpdate { taskId: "2", addBlockedBy: ["1"] }

Step 4 — Communicate

Send direct messages to teammates:

SendMessage {
  type: "message",
  recipient: "test-writer",
  content: "Tests look good. Moving to builder phase.",
  summary: "Tests approved"
}

Broadcast to all (use sparingly — costs scale with team size):

SendMessage {
  type: "broadcast",
  content: "Blocking issue found — stop all work.",
  summary: "Critical blocker found"
}

Step 5 — Shut Down and Clean Up

Gracefully terminate each teammate, then delete the team:

SendMessage {
  type: "shutdown_request",
  recipient: "test-writer",
  content: "All tasks complete, shutting down."
}

After all teammates have shut down:

TeamDelete {}

4. Ideation Phase (Pre-Ready)

Before a feature enters the agent pipeline, it goes through an ideation phase where the human explores, researches, and shapes the idea. Source feature-docs/new-feature.md to start (or resume) the guided workflow.

Ideation happens in feature-docs/ideation/ with one subfolder per feature:

feature-docs/ideation/
  CLAUDE.md               # Auto-discovered guide for all ideation folders
  001-user-auth/
    README.md             # Status tracking + progress log
    code-review.md        # Analysis of existing code to change
    api-research.md       # How other projects solve this
    design-notes.md       # Data flow, component tree, schema
    spike-results.md      # Quick experiments
  002-cart-redesign/
    README.md
    current-analysis.md
    competitor-notes.md

Starting or Resuming Ideation

Source feature-docs/new-feature.md — it handles both cases:

New feature: Asks what you want to build, creates the ideation folder, walks you through validation (code review, research, design), saves artifacts as you go
Resume: Scans for folders with status: in-progress, reads all artifacts, summarises where you left off, continues from open questions

Status Tracking

Each ideation folder's README.md has YAML frontmatter:

---
feature: user-auth
status: in-progress # or: complete, shipped
created: 2025-01-15
---
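
The resume scan can be approximated by looking for that status line in each folder's README.md. This is a minimal sketch with a hypothetical function name; it matches the status line anywhere in the file rather than parsing the frontmatter strictly, which is an assumption about how loose the check is allowed to be.

```shell
# List ideation folders whose README.md declares status: in-progress.
find_in_progress_ideation() {
  local root="${1:-feature-docs/ideation}" readme
  for readme in "$root"/*/README.md; do
    [ -f "$readme" ] || continue       # unmatched glob stays literal; skip it
    # Match "status: in-progress" even when a trailing "# comment" follows it.
    if grep -Eq '^status:[[:space:]]*in-progress([[:space:]]|$)' "$readme"; then
      dirname "$readme"
    fi
  done
}
```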

The ## Progress section tracks dated entries across sessions:

### 2025-01-15 — Initial exploration

- **What we did**: Reviewed existing auth code, identified session management gap
- **Decisions made**: Use httpOnly cookies, not localStorage
- **Open questions**: Which OAuth provider to use later?

### 2025-01-16 — API design

- **What we did**: Designed login/logout endpoints, drafted store structure
- **Decisions made**: Separate auth store from user profile store
- **Open questions**: How to handle token refresh?

What Goes in an Ideation Folder

There are no format rules — use whatever helps you think:

Code reviews — Analysis of existing code the feature will touch
Research notes — API docs, how other projects solve this, trade-offs
Design sketches — Data flow diagrams, component trees, schema changes
Spike results — Quick experiments to validate an approach
Conversation logs — Key decisions and reasoning from Claude sessions

Distilling into a Feature Doc

When the feature is clear enough to write testable acceptance criteria, say "create the feature" during your ideation session. The prompt will:

Read all files in the ideation folder
Synthesise the summary from across all artifacts
Extract testable behaviours as GIVEN/WHEN/THEN acceptance criteria
Identify affected files from code reviews and design notes
Flag gaps (missing error cases, unresolved decisions, no affected files)
Save the final doc to feature-docs/ready/NNN-<feature-name>.md (with the numeric prefix described under Naming Convention)
Set ideation-ref in the feature doc frontmatter pointing back to the ideation folder
Update the ideation README status to complete

The ideation folder stays as an archive. Agents work from the distilled feature doc in ready/ and do not read ideation folders by default; the ideation-ref field lets an agent consult the ideation folder when it needs additional context.

When the feature later completes the full pipeline (reviewer approves, doc moves to completed/), the coordinator updates the ideation README status from complete to shipped and appends a final progress entry noting pipeline completion. This is handled by the coordinator's "After reviewer approves" checklist in implement-feature.md.

To skip ideation when you already know what you want, source feature-docs/new-feature.md and choose "skip to feature doc" when prompted. The same entry point handles both paths (ideation and direct creation).

5. Feature Doc Format

Feature docs live in feature-docs/ with subdirectories for each lifecycle stage. Create this directory structure in your project:

feature-docs/
  ideation/           # Human explores and shapes ideas here
  ready/              # Distilled feature doc goes here
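  testing/            # TDD: test-writer moves doc here; frontend: builder moves doc here when done
  building/           # Builder moves doc here while implementing
  review/             # TDD: builder moves doc here when tests pass; frontend: test-writer moves doc here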
  completed/          # Reviewer moves doc here when done

Template

---
title: User Authentication
status: ready
priority: high
depends-on: 004-session-management
affected-files:
  - src/auth/authenticate.ts
  - src/auth/session.ts
  - src/stores/auth-store.ts
  - src/components/login-form.tsx
---

# User Authentication

## Summary

Add email/password login with session management. Users can log in, stay
authenticated across page reloads, and log out.

## Acceptance Criteria

1. GIVEN a valid email and password WHEN `authenticate(email, password)` is called
   THEN it returns a `Session` with a non-null `token` and `expiresAt` > now
2. GIVEN an email with no matching user WHEN `authenticate(email, password)` is
   called THEN it throws `AuthenticationError` with code `"INVALID_CREDENTIALS"`
3. GIVEN `authStore.getState().isAuthenticated` is `true` WHEN `logout()` is called
   THEN `authStore.getState().session` is `null` and the session cookie is cleared
4. GIVEN a session cookie with a valid token WHEN `restoreSession()` is called
   THEN `authStore.getState().isAuthenticated` is `true`
5. GIVEN a session cookie with an expired token WHEN `restoreSession()` is called
   THEN `authStore.getState().session` is `null` and the cookie is cleared

## Edge Cases

- Empty email or password to `authenticate()` — throws `ValidationError` with
  code `"EMPTY_FIELD"` before any network request
- Session cookie with malformed JSON — `restoreSession()` clears the cookie
  silently without throwing

## Out of Scope

- OAuth/social login (separate feature) — do NOT add OAuth types to `Session`
- Do NOT touch `src/api/client.ts` interceptor (has a `TODO: add auth` comment;
  leave as-is to avoid breaking existing API calls)

## Technical Notes

- Session token uses httpOnly cookie, not localStorage
- **Rejected**: localStorage with encryption wrapper — XSS-accessible, no real
  protection. httpOnly cookies are invisible to JS entirely.

Acceptance Criteria Rules

Every acceptance criterion must be:

Testable — can be verified by an automated test
Specific — names exact functions, fields, error types, and return values
Independent — does not depend on other criteria passing first
Complete — covers the happy path, error cases, and edge cases

Vague criteria produce vague tests, which in turn produce wrong implementations.

Feature Dependencies

Features can declare a dependency on one other feature using the depends-on frontmatter field. The value is the filename stem of the dependency (e.g., 005-user-auth).

One level per doc: Each feature declares only its immediate parent. Feature 006 says depends-on: 005-session-mgmt. Feature 005 says depends-on: 004-data-layer. The full chain (006 → 005 → 004) is resolved dynamically at check time — no feature stores the entire chain.

Recursive resolution: The scripts/check-deps.sh script walks the chain from the target feature all the way down. If ANY dependency in the chain is not in completed/, the feature is BLOCKED and must not be picked up.

Blocking behavior:

In TeammateIdle hooks: blocked features are skipped. The hook continues searching for unblocked work.
In agent pickup (builder/test-writer): agents check dependencies before starting. If blocked, they report to the user and stop.
In implement-feature.md coordinator flow: the pre-flight check warns the user and asks whether to wait or override.

Circular dependency detection: The script tracks visited features and exits with an error if a cycle is found (e.g., A → B → A).
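
The chain walk and cycle detection might be sketched as follows. This is not the actual scripts/check-deps.sh: the function names, the exit codes (0 unblocked, 1 blocked, 2 cycle or missing doc), and the message formats are all assumptions for illustration. It relies only on the rules stated above: one depends-on value per doc, and a dependency counts as met only when its doc is in completed/.

```shell
# Locate a feature doc by filename stem in any lifecycle directory.
find_feature_doc() {
  local root="$1" stem="$2" stage
  for stage in ready testing building review completed; do
    if [ -f "$root/$stage/$stem.md" ]; then
      echo "$root/$stage/$stem.md"
      return 0
    fi
  done
  return 1
}

# Read the single depends-on value from the YAML frontmatter (empty if absent).
read_depends_on() {
  awk '
    /^---[ \t]*$/ { fences++; next }
    fences >= 2 { exit }
    fences == 1 && /^depends-on:/ {
      sub(/^depends-on:[ \t]*/, ""); sub(/[ \t]+$/, ""); print; exit
    }
  ' "$1"
}

# Walk the whole chain; hypothetical exit codes: 0 ok, 1 blocked, 2 cycle/missing.
check_deps() {
  local root="$1" current="$2" visited=" $2 " blocked=0 doc dep
  while :; do
    doc="$(find_feature_doc "$root" "$current")" || {
      echo "ERROR [DEPS]: no feature doc found for $current"; return 2; }
    dep="$(read_depends_on "$doc")"
    [ -n "$dep" ] || break               # end of the chain
    case "$visited" in
      *" $dep "*) echo "ERROR [DEPS]: circular dependency at $dep"; return 2 ;;
    esac
    visited="$visited$dep "
    if [ ! -f "$root/completed/$dep.md" ]; then
      echo "BLOCKED: $dep is not in completed/"
      blocked=1
    fi
    current="$dep"
  done
  return "$blocked"
}
```

The walk continues past a blocked dependency instead of stopping at the first one, so a cycle further down the chain is still reported.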

When to use depends-on:

Feature B cannot function without Feature A's code being merged (runtime dependency)
Feature B's acceptance criteria reference outputs from Feature A
Feature B modifies files that Feature A creates (sequential file ownership)

When NOT to use depends-on:

Features that merely share a domain but are independently testable
Priority ordering (use priority: high/medium/low instead)
Features that could run in parallel with non-overlapping files

Vague versus precise acceptance criteria (see Acceptance Criteria Rules above):

Vague (agent has to guess)	Precise (agent can write a test)
THEN the login works	THEN `authenticate()` returns a `Session` with non-null `token`
THEN an error is shown	THEN it throws `AuthenticationError` with code `"INVALID_CREDENTIALS"`
THEN the data is saved	THEN `authStore.getState().session` contains the `Session`
THEN the field is removed	THEN the returned object does NOT include a `legacyField` key

6. Agent Roles

Test Writer

Purpose: Produce tests that verify the feature doc's acceptance criteria.

Frontend (build-first):

Reads: Feature doc from feature-docs/testing/ + builder's implementation
Produces: Playwright E2E tests that PASS — no Vitest unit tests
Tests verify the user-visible interface against acceptance criteria
If a test fails, the builder has a bug (report it, don't work around it)
Moves doc: testing/ → review/

Python/Rust (TDD):

Reads: Feature doc from feature-docs/ready/
Produces: Test files that FAIL (all tests must fail before handing off)
Tests import from implementation paths even though files may not exist yet
Moves doc: ready/ → testing/

Shared constraints:

Never writes implementation code — only test files
Each acceptance criterion produces at least one test
Edge cases from the feature doc produce additional tests
Commits tests with test(<scope>): add [failing] tests for <feature-name>

Builder

Purpose: Write implementation code for the feature.

Frontend (build-first):

Reads: Feature doc from feature-docs/ready/
Produces: Implementation code directly from acceptance criteria
Creates the feature branch
Moves doc: ready/ → building/ → testing/

Python/Rust (TDD):

Reads: Feature doc from feature-docs/testing/, failing test files
Produces: Implementation code that makes all tests pass
Moves doc: testing/ → building/ → review/

Shared constraints:

Never modifies test files — if tests are wrong, stop and report to the user
Must run scripts/verify.sh after implementation
Only touches files listed in the feature doc's affected-files
Commits implementation with feat(<scope>): implement <feature-name>

Reviewer

Purpose: Catch what tests cannot — code quality, convention adherence, design system consistency, and qualitative issues.

Maps to: The existing code-reviewer universal agent, extended with agent-teams awareness.

Checks:

Code follows project conventions (CLAUDE.md rules)
No duplicate logic introduced
Error handling is complete
Types are correct and specific (no any, no unwrap in production paths)
Component library used correctly (shadcn for frontend, idiomatic patterns for backend)
Feature doc acceptance criteria all have corresponding tests
Tests actually validate the criteria (not just trivially passing)

Produces: Review report. If issues found, status stays at review. If approved, reviewer moves doc to feature-docs/completed/.

Constraints:

Strictly read-only — never edits implementation or test files
Never uses Bash to modify files (sed -i, echo >, etc.)
Reports issues to the coordinator; the coordinator routes fixes to the appropriate agent
Independence is the reviewer's value — if the reviewer fixes code, it cannot objectively review it

Coordinator

Purpose: Orchestrate the pipeline — scan for work, run pre-flight checks, invoke agents, verify lifecycle compliance between stages, and manage the progress dashboard. The coordinator never writes implementation or test code.

Identity: The main Claude Code session that sources implement-feature.md. Unlike the other roles, the coordinator is not a named agent with restricted tools; it has full tool access by default. The constraints listed below are self-imposed through prompt instructions.

Reads: Feature docs (all directories), STATUS.md, verify output, agent reports

Produces: Team lifecycle management, feature doc lifecycle moves, STATUS.md updates

Allowed operations:

Read, Grep, Glob, and read-only Bash on any file
TeamCreate, Agent, SendMessage, TeamDelete for team lifecycle
TaskCreate/TaskUpdate for tracking work items
sed on feature doc frontmatter (status: field only)
mv to move feature docs between lifecycle directories
Write/Edit on feature-docs/STATUS.md only

Constraints:

Never uses Write, Edit, or sed on files listed in affected-files
Never uses Write, Edit, or sed on test files
Never uses Write, Edit, or sed on any implementation/source file
When code needs fixing, re-invokes the responsible agent with specific error details
When tests are wrong, reports to the user or re-invokes the test-writer

7. Feature Doc Lifecycle

Frontend (Build-First)

Human explores idea      →  (feature-docs/ideation/<name>/)
  └─ Code reviews, research, design notes, spikes
Human distills doc       →  status: ready       (feature-docs/ready/)
Builder picks up         →  status: building    (feature-docs/building/)
  └─ Implements from acceptance criteria on feature branch
Builder finishes         →  status: testing     (feature-docs/testing/)
  └─ All verification passes, implementation complete
Test-writer picks up     →  status: testing     (feature-docs/testing/)
  └─ Writes passing Playwright E2E tests
Test-writer finishes     →  status: review      (feature-docs/review/)
  └─ E2E tests pass, verify clean
Reviewer validates       →  status: completed   (feature-docs/completed/)
  └─ Approved by reviewer
Coordinator merges       →  PR created and merged to main
  └─ Returns to main, ready for next feature

Python/Rust (TDD)

Human explores idea      →  (feature-docs/ideation/<name>/)
  └─ Code reviews, research, design notes, spikes
Human distills doc       →  status: ready       (feature-docs/ready/)
Test-writer picks up     →  status: testing     (feature-docs/testing/)
  └─ Failing tests committed on feature branch
Builder picks up         →  status: building    (feature-docs/building/)
  └─ Implements until all tests pass
Builder finishes         →  status: review      (feature-docs/review/)
  └─ All tests + verify pass
Reviewer validates       →  status: completed   (feature-docs/completed/)
  └─ Approved by reviewer
Coordinator merges       →  PR created and merged to main
  └─ Returns to main, ready for next feature

Status Transitions

Frontend (Build-First):

From	To	Who	Action
ready	building	builder	Move doc, create branch, implement from spec
building	testing	builder	Move doc, verify passes, implementation done
testing	review	test-writer	Move doc, E2E tests written and passing
review	completed	reviewer	Move doc, approve quality
review	testing	reviewer	Move doc back, E2E test gaps found
review	building	reviewer	Move doc back, implementation issues found

Python/Rust (TDD):

From	To	Who	Action
ready	testing	test-writer	Move doc, write failing tests, commit
testing	building	builder	Move doc, begin implementation
building	testing	builder	BOUNCE: defective tests, create bounce file
building	review	builder	Move doc, all tests pass, verify clean
review	completed	reviewer	Move doc, approve quality
review	building	reviewer	Move doc back, issues found (re-work)

The status field in the feature doc frontmatter and the directory location must stay in sync. Moving the file IS the status transition.

Branch Strategy

Each feature gets its own branch: feat/<feature-name> (following git-workflow skill conventions).

The first agent checks out main and pulls before creating the branch
All agents commit on the same branch
Reviewer reviews the branch
After reviewer approval, the coordinator creates a PR (gh pr create) and merges it (gh pr merge --squash --delete-branch)
The coordinator returns to main (git checkout main && git pull) before the next feature starts
This ensures each new feature branches from the latest main, not from a previous unmerged feature

Naming Convention

Feature doc filenames use a 3-digit numeric prefix: NNN-feature-name.md (e.g., 001-user-auth.md, 002-cart-redesign.md). The prefix is assigned at creation time by running scripts/next-feature-number.sh, which scans all lifecycle directories and ideation folders for existing prefixes and returns the next available number. Ideation folders use the same prefix (e.g., ideation/001-user-auth/). The numeric prefix carries through the entire lifecycle — the same file that starts as ready/001-user-auth.md becomes testing/001-user-auth.md, then building/, review/, and completed/.

This prevents confusion between similarly-named features. 001-user-auth.md can never be mistaken for 002-user-auth-v2.md.
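
Prefix allocation could work roughly like this sketch. It is not the actual scripts/next-feature-number.sh; the function name and the default feature-docs root are assumptions. It scans the ideation folder and every lifecycle directory for NNN- prefixes and prints the next number, zero-padded to three digits.

```shell
next_feature_number() {
  local root="${1:-feature-docs}" max=0 stage entry name prefix num
  for stage in ideation ready testing building review completed; do
    for entry in "$root/$stage"/[0-9][0-9][0-9]-*; do
      [ -e "$entry" ] || continue        # unmatched glob stays literal; skip it
      name="${entry##*/}"
      prefix="${name%%-*}"
      # Strip leading zeros so 008 is not parsed as an invalid octal number.
      num="${prefix#"${prefix%%[!0]*}"}"
      num="${num:-0}"
      [ "$num" -gt "$max" ] && max="$num"
    done
  done
  printf '%03d\n' $((max + 1))
}
```

Because ideation folders are included in the scan, a number reserved during ideation is not reused by a feature doc created directly in ready/.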

8. Coordination Protocol

Automated Kickoff

Source feature-docs/implement-feature.md to scan ready/ for available features, run pre-flight checks (section completeness, file ownership conflicts, dependency chain), detect the stack, and kick off the first agent (builder for frontend, test-writer for Python/Rust). After each stage, the TeammateIdle hook reports pending work and the coordinator launches the agent for the next role.

Dependency awareness: Before kicking off any feature, the coordinator checks its dependency chain via scripts/check-deps.sh. If the feature has unmet dependencies, the coordinator warns the user and suggests waiting or proceeding with an override. The TeammateIdle hook automatically skips blocked features when scanning for pending work.

Sequential Pipeline — Frontend (Build-First)

# 1. Create team
TeamCreate { team_name: "feat-user-auth" }

# 2. Spawn builder
Agent {
  team_name: "feat-user-auth",
  name: "builder",
  subagent_type: "builder",
  prompt: "Pick up feature-docs/ready/001-user-auth.md",
  mode: "auto"
}

# 3. Wait for builder to finish (TeammateIdle notification)
# 4. Shut down builder
SendMessage { type: "shutdown_request", recipient: "builder" }

# 5. Spawn test-writer for E2E tests
Agent {
  team_name: "feat-user-auth",
  name: "test-writer",
  subagent_type: "test-writer",
  prompt: "Pick up feature-docs/testing/001-user-auth.md",
  mode: "auto"
}

# 6. Wait for test-writer to finish
# 7. Shut down test-writer, spawn reviewer
SendMessage { type: "shutdown_request", recipient: "test-writer" }

Agent {
  team_name: "feat-user-auth",
  name: "reviewer",
  subagent_type: "code-reviewer",
  prompt: "Review feature-docs/review/001-user-auth.md",
  mode: "auto"
}

# 8. Wait for reviewer, then merge and clean up
# Create PR and merge to main
gh pr create --base main --head "feat/user-auth" --title "feat(auth): user authentication" --body "..."
gh pr merge --squash --delete-branch

# Return to main
git checkout main
git pull origin main

# Clean up the team
SendMessage { type: "shutdown_request", recipient: "reviewer" }
TeamDelete {}

Sequential Pipeline — Python/Rust (TDD)

# 1. Create team
TeamCreate { team_name: "feat-config" }

# 2. Spawn test-writer (writes failing tests first)
Agent {
  team_name: "feat-config",
  name: "test-writer",
  subagent_type: "test-writer",
  prompt: "Pick up feature-docs/ready/003-config.md",
  mode: "auto"
}

# 3. Wait for test-writer to finish (TeammateIdle notification)
# 4. Shut down test-writer
SendMessage { type: "shutdown_request", recipient: "test-writer" }

# 5. Spawn builder
Agent {
  team_name: "feat-config",
  name: "builder",
  subagent_type: "builder",
  prompt: "Pick up feature-docs/testing/003-config.md",
  mode: "auto"
}

# 6. Wait for builder to finish
# 7. Shut down builder, spawn reviewer
SendMessage { type: "shutdown_request", recipient: "builder" }

Agent {
  team_name: "feat-config",
  name: "reviewer",
  subagent_type: "code-reviewer",
  prompt: "Review feature-docs/review/003-config.md",
  mode: "auto"
}

# 8. Wait for reviewer, then merge and clean up
# Create PR and merge to main
gh pr create --base main --head "feat/config" --title "feat(config): configuration system" --body "..."
gh pr merge --squash --delete-branch

# Return to main
git checkout main
git pull origin main

# Clean up the team
SendMessage { type: "shutdown_request", recipient: "reviewer" }
TeamDelete {}

Python/Rust: Test Bounce-Back (builder → test-writer → builder)

If the builder detects defective tests (wrong assertions, missing pytest.raises, tests that contradict the feature doc), it moves the feature doc back to testing/, creates a bounce file (<name>.bounce.md), and exits. The coordinator detects this and re-invokes the test-writer in fix mode.

Detection: After the builder finishes (TeammateIdle or manual check), check whether it bounced — the feature doc will be in testing/ (not review/):

ls feature-docs/testing/<filename>.bounce.md

If a bounce file exists:

Check bounce count: Read the bounce-count from the feature doc frontmatter. If it is 3 or higher, escalate to the user — the problem is likely in the acceptance criteria, not test mechanics.

Re-invoke the test-writer in fix mode:

Agent {
  team_name: "feat-<feature-name>",
  name: "test-writer",
  subagent_type: "test-writer",
  prompt: "Fix defective tests per feature-docs/testing/<filename>.bounce.md",
  mode: "auto"
}

Wait for the test-writer to complete, then re-invoke the builder:

SendMessage { type: "shutdown_request", recipient: "test-writer" }

Agent {
  team_name: "feat-<feature-name>",
  name: "builder",
  subagent_type: "builder",
  prompt: "Pick up feature-docs/testing/<filename>.md — tests have been fixed after bounce-back.",
  mode: "auto"
}

Circuit breaker: When bounce-count reaches 3, escalate to the user. Do not re-invoke agents automatically — the issue likely requires revising the feature doc's acceptance criteria.
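
The coordinator's decision after a builder run can be sketched as a small function. The function names and the three result strings (continue, fix-tests, escalate) are hypothetical; the bounce file naming, the bounce-count frontmatter field, and the limit of 3 follow the description above. A missing or non-numeric bounce-count is treated as 0, which is an assumption.

```shell
# Read bounce-count from the frontmatter; default to 0 when absent or malformed.
read_bounce_count() {
  local count
  count="$(awk '
    /^---[ \t]*$/ { fences++; next }
    fences >= 2 { exit }
    fences == 1 && /^bounce-count:/ {
      sub(/^bounce-count:[ \t]*/, ""); sub(/[ \t]+$/, ""); print; exit
    }
  ' "$1")"
  case "$count" in
    ''|*[!0-9]*) count=0 ;;
  esac
  echo "$count"
}

# Decide the next step for a feature doc sitting in testing/.
next_bounce_action() {
  local doc="$1" limit="${2:-3}" bounce_file="${1%.md}.bounce.md"
  if [ ! -f "$bounce_file" ]; then
    echo "continue"                      # no bounce: proceed with the pipeline
  elif [ "$(read_bounce_count "$doc")" -ge "$limit" ]; then
    echo "escalate"                      # circuit breaker: hand back to the user
  else
    echo "fix-tests"                     # re-invoke the test-writer in fix mode
  fi
}
```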

Concurrency Rules

Same-role parallelism is allowed. The coordinator may launch multiple Agent calls simultaneously, each working on a different piece or a different feature.
Cross-role parallelism is forbidden. Builders and test-writers must never run at the same time. Complete ALL agents of one role before starting the next role.
Clean shutdown between roles. Send shutdown_request to each teammate and verify all agents of the current role have fully stopped before spawning the next role. Teammates finish their current turn before exiting.

Parallel Workflow (Multiple Features)

For multiple features in parallel, ensure no affected-files overlap. Use the stack-appropriate first agent (builder for frontend, test-writer for Python/Rust). If features share files, run them sequentially to avoid conflicts.

Parallel Investigation

Spawn multiple teammates to explore in parallel:

TeamCreate { team_name: "investigate-perf" }

Agent {
  team_name: "investigate-perf",
  name: "db-investigator",
  subagent_type: "general-purpose",
  prompt: "Investigate database query performance in src/db/",
  mode: "auto"
}

Agent {
  team_name: "investigate-perf",
  name: "api-investigator",
  subagent_type: "general-purpose",
  prompt: "Investigate API endpoint latency in src/api/",
  mode: "auto"
}

TeammateIdle Hook

When a teammate finishes work and goes idle, the TeammateIdle hook scans feature-docs/ for pending work and logs what it finds. The hook always exits 0, allowing the agent session to terminate cleanly. The coordinator is responsible for launching fresh agent sessions for the next role.

Frontend (build-first) scan priority:

feature-docs/testing/ — Needs test-writer for E2E tests
feature-docs/ready/ — Needs builder to implement
feature-docs/review/ — Needs reviewer

Python/Rust (TDD) scan priority:

feature-docs/testing/ — Failing tests exist, needs a builder
feature-docs/ready/ — Feature doc waiting, needs a test-writer
feature-docs/review/ — Implementation done, needs a reviewer

The hook logs pending work to stderr for the coordinator's awareness, but does not redirect the idle agent. This prevents finished agents from lingering and interfering with the next role's file changes.
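
The scan order above can be sketched as a function that only reports and always succeeds. The function name, the stack argument (frontend or tdd), and the PENDING message format are assumptions. Dependency checks and bounce handling are deliberately left out of this sketch, so bounce files are simply skipped.

```shell
# Log pending work to stderr in priority order: testing, ready, review.
scan_pending_work() {
  local root="${1:-feature-docs}" stack="${2:-tdd}" stage role doc
  for stage in testing ready review; do
    case "$stack:$stage" in
      frontend:testing) role="test-writer" ;;
      frontend:ready)   role="builder" ;;
      tdd:testing)      role="builder" ;;
      tdd:ready)        role="test-writer" ;;
      *:review)         role="reviewer" ;;
      *) continue ;;                     # unknown stack: nothing to report
    esac
    for doc in "$root/$stage"/*.md; do
      [ -f "$doc" ] || continue
      case "$doc" in *.bounce.md) continue ;; esac
      echo "PENDING [$stage]: ${doc##*/} needs $role" >&2
    done
  done
  return 0                               # never block the idle agent from exiting
}
```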

TaskCompleted Hook

When any teammate tries to mark a task as done, the TaskCompleted hook runs two checks:

1. Lifecycle compliance — Scans all feature docs in ready/, testing/, building/, review/, and completed/. For each doc with a status: field, verifies the value matches the directory name. If any feature doc is in the wrong directory (e.g., still in ready/ when it should be in testing/), the task is blocked. This prevents agents from skipping the doc-move step.

2. Full verify pipeline:

Type checking (tsc / mypy / cargo check)
Linting (eslint / ruff / clippy)
Tests (vitest / pytest / cargo test)

If either check fails, the task cannot be marked done. The agent sees the error output and must fix the issue before trying again.
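The lifecycle compliance check amounts to comparing each doc's status: value with the name of the directory it sits in. The Python sketch below illustrates that comparison under simple assumptions: the first line starting with `status:` is treated as the frontmatter field, and `lifecycle_mismatches` is a hypothetical name, not a function from the real script.

```python
import re
from pathlib import Path

STAGES = ("ready", "testing", "building", "review", "completed")
# First line of the form "status: <value>"; assumed to be the frontmatter field.
STATUS_LINE = re.compile(r"^status:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def lifecycle_mismatches(feature_docs_root: Path) -> list[str]:
    """List docs whose status: value differs from their directory name."""
    problems = []
    for stage in STAGES:
        for doc in sorted((feature_docs_root / stage).glob("*.md")):
            match = STATUS_LINE.search(doc.read_text())
            if match is None:
                continue  # docs without a status: field are not checked
            if match.group(1) != stage:
                problems.append(f"{stage}/{doc.name}: status is {match.group(1)!r}")
    return problems
```

An empty list means the check passes; any entry would block the task.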

9. File Ownership Rules

Claiming Files

When an agent picks up a feature doc, the affected-files list in the frontmatter declares which files that agent may modify. Before starting:

Read all feature docs in feature-docs/testing/ and feature-docs/building/
Collect their affected-files lists
Check for overlap with the current feature's affected-files
If overlap exists, report to the user and wait — do not proceed
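The claim check above can be sketched as a set intersection over affected-files lists. This is a minimal Python illustration that assumes the simple frontmatter shape used in this skill (an `affected-files:` key followed by `- path` items); `parse_affected_files` and `find_overlaps` are hypothetical helper names.

```python
from pathlib import Path


def parse_affected_files(doc_text: str) -> set[str]:
    """Extract the affected-files list from a feature doc's frontmatter."""
    files: set[str] = set()
    in_frontmatter = False
    in_list = False
    for line in doc_text.splitlines():
        stripped = line.strip()
        if stripped == "---":
            if in_frontmatter:
                break  # closing delimiter: frontmatter is done
            in_frontmatter = True
            continue
        if not in_frontmatter:
            continue
        if stripped == "affected-files:":
            in_list = True
        elif in_list and stripped.startswith("- "):
            files.add(stripped[2:].strip())
        elif stripped:
            in_list = False  # any other key ends the list
    return files


def find_overlaps(current_doc: Path, feature_docs_root: Path) -> dict[str, set[str]]:
    """Map each in-flight doc name to the files it shares with current_doc."""
    mine = parse_affected_files(current_doc.read_text())
    overlaps: dict[str, set[str]] = {}
    for stage in ("testing", "building"):
        for doc in sorted((feature_docs_root / stage).glob("*.md")):
            if doc.resolve() == current_doc.resolve():
                continue  # do not compare a doc with itself
            shared = mine & parse_affected_files(doc.read_text())
            if shared:
                overlaps[doc.name] = shared
    return overlaps
```

A non-empty result is the signal to report to the user and wait rather than proceed.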

Resolving Conflicts

If two features must touch the same file:

Run them sequentially (feature A completes fully before feature B starts)
Or split the shared file into separate modules first

Test File Ownership

Test files are owned exclusively by the test-writer. The builder must never modify them.

Python/Rust: If a test is wrong, the builder creates a bounce file (<name>.bounce.md) in feature-docs/testing/ describing the defects, moves the feature doc back to testing/, and stops. The coordinator re-invokes the test-writer in fix mode. The builder never modifies test files or writes production code to accommodate a defective test.

Frontend: E2E test files are created by the test-writer after the builder finishes. The builder has no test files to modify.

10. Style Work (Frontend Only)

Style refinement cannot be fully automated because "looks right" is subjective.

Style Doc Format

Style docs follow the same template as feature docs but live in styles/ instead of feature-docs/:

---
title: Dashboard Cards Redesign
status: ready
affected-files:
  - src/components/dashboard/stat-card.tsx
  - src/components/dashboard/chart-card.tsx
---

# Dashboard Cards Redesign

## Visual Direction

- Cards should use subtle shadows instead of borders
- Stat numbers should use the display font at 2xl
- Charts should fill the card width with 16px padding

## Reference

- See designs in Figma: [link]
- Similar to the pattern in src/components/existing-card.tsx

Iteration Loop

Human writes a style doc with visual direction
Style agent applies changes and generates screenshots to styles/reviews/<name>/iteration-N/
Agent sets status to awaiting-review and stops
Human reviews screenshots, writes feedback in the style doc
Agent reads feedback, applies another iteration
When human approves, screenshots become Playwright visual regression baselines

Approved screenshots are locked in as automated tests. Future agents cannot drift from the approved design without failing a visual regression test.

11. Hook Configuration

TaskCompleted

Blocks task completion until lifecycle compliance and the full verify pipeline pass.

{
  "event": "TaskCompleted",
  "command": "bash scripts/task-completed.sh"
}

The script runs two checks. First, it scans feature docs for status/directory mismatches (e.g., a doc in ready/ with status: testing) and blocks if any are found. Second, it runs scripts/verify.sh (full pipeline) and blocks (exit 2) on any failure. Output is truncated to 30 lines to avoid context pollution. Verbose logs are available in agent_logs/ for debugging.

Lifecycle-aware: For Python/Rust during the testing stage, only lifecycle compliance is checked — the verify pipeline is skipped because tests are expected to fail. For frontend, all stages run full verification (no stage has expected failures in the build-first flow).
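The blocking and truncation behavior described for the TaskCompleted script can be illustrated as follows. This Python sketch is not the real scripts/task-completed.sh. It assumes truncation keeps the last 30 lines (the source does not say whether the head or the tail is kept), and `tail` and `run_gate` are hypothetical names.

```python
import subprocess


def tail(output: str, max_lines: int = 30) -> str:
    """Keep only the last max_lines lines of output (assumed: tail, not head)."""
    return "\n".join(output.splitlines()[-max_lines:])


def run_gate(command: list[str], max_lines: int = 30) -> tuple[int, str]:
    """Run a verify command; return (hook_exit_code, message_for_agent)."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode == 0:
        return 0, ""  # gate passed: allow the task to complete
    # Exit code 2 is the blocking signal; the agent sees the truncated output.
    return 2, tail(result.stdout + result.stderr, max_lines)
```

A caller might use `run_gate(["bash", "scripts/verify.sh"])`, print the message to stderr, and exit with the returned code.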

TeammateIdle

Logs pending work for the coordinator's awareness when a teammate goes idle.

{
  "event": "TeammateIdle",
  "command": "bash scripts/teammate-idle.sh"
}

The script first checks for stuck features (in building/ for over 30 minutes) and warns if found. Then it scans feature-docs/ directories and logs any pending work to stderr. Always exits 0 to let the agent session terminate cleanly — the coordinator launches fresh sessions for the next role.
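The stuck-feature check can be sketched as an age comparison. This Python illustration uses each doc's modification time as a stand-in for when it entered building/; that is an assumption of the sketch, and the real script may track the time differently. `stuck_features` is a hypothetical name.

```python
import time
from pathlib import Path
from typing import Optional

STUCK_AFTER_SECONDS = 30 * 60  # the 30-minute threshold described above


def stuck_features(feature_docs_root: Path, now: Optional[float] = None) -> list[str]:
    """Return names of docs that have sat in building/ for over 30 minutes."""
    now = time.time() if now is None else now
    stuck = []
    for doc in sorted((feature_docs_root / "building").glob("*.md")):
        # Assumption: mtime approximates the moment the doc moved to building/.
        if now - doc.stat().st_mtime > STUCK_AFTER_SECONDS:
            stuck.append(doc.name)
    return stuck
```

Passing `now` explicitly keeps the function easy to test; a hook would call it with the default and log a warning for each name returned.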

Stop (Fast Verify on Change)

Runs fast verification (type check only) after each Claude response when files have changed. Full verification is deferred to TaskCompleted to avoid spending agent time on the full suite during iterative development.

{
  "event": "Stop",
  "command": "bash scripts/stop-hook.sh"
}

The script checks git diff and git ls-files for modifications. If the working tree is clean, it exits 0 (skips verify). If files have changed, it runs scripts/fast-verify.sh (type check only) for quick feedback. If no fast-verify script exists, it falls back to scripts/verify.sh. It reads stop_hook_active from stdin to prevent recursive loops. Output is truncated to 20 lines.

Lifecycle-aware: For Python/Rust during the testing stage, verification is skipped entirely because test-writer code references unimplemented APIs that will always fail type checking. For frontend, verification runs at all stages.
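The Stop hook's decision order can be sketched as a single function. This Python illustration is not the real scripts/stop-hook.sh: `choose_verify_script` and its parameters are hypothetical, the list of changed files is assumed to come from the git checks, and the lifecycle stage is passed in as a flag rather than detected.

```python
import json
from pathlib import Path
from typing import Optional


def choose_verify_script(
    hook_input: str,
    changed_files: list[str],
    scripts_dir: Path,
    tdd_testing_stage: bool = False,
) -> Optional[Path]:
    """Return the script the Stop hook should run, or None to skip."""
    payload = json.loads(hook_input or "{}")
    if payload.get("stop_hook_active"):
        return None  # already continuing from a Stop hook: avoid looping
    if tdd_testing_stage:
        return None  # Python/Rust testing stage: failures are expected
    if not changed_files:
        return None  # clean working tree: nothing to verify
    fast = scripts_dir / "fast-verify.sh"
    # Fall back to the full pipeline when no fast-verify script exists.
    return fast if fast.exists() else scripts_dir / "verify.sh"
```

Checking `stop_hook_active` first means a failed verify cannot trigger another Stop hook run on the follow-up response.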

Branch Protection

The guard-bash.sh PreToolUse hook blocks direct commits on main/master, forcing agents to work on feature branches. This complements the branch-per-feature strategy described in the coordination protocol.
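The core of such a guard is a small predicate. The Python sketch below shows the idea only; the actual rules in guard-bash.sh are not shown in this document, `should_block` is a hypothetical name, and the pattern deliberately covers only the plain `git commit` form.

```python
import re

PROTECTED_BRANCHES = {"main", "master"}
# Matches the plain "git commit" form only; other invocations are out of scope.
COMMIT_COMMAND = re.compile(r"\bgit\s+commit\b")


def should_block(command: str, current_branch: str) -> bool:
    """True if the shell command would commit directly on a protected branch."""
    return current_branch in PROTECTED_BRANCHES and bool(COMMIT_COMMAND.search(command))
```

For example, `should_block("git commit -m 'wip'", "main")` is true, while the same command on a feature branch is allowed.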

12. Interaction Controls

tmux Mode

Click into any teammate's pane to interact directly
Each pane shows the teammate's full terminal session
Standard tmux controls for pane management

in-process Mode

Shift+Down — cycle through active teammates
Enter — view a teammate's full session
Escape — interrupt current turn
Ctrl+T — toggle task list view
Type to send messages to the currently visible teammate

13. Bootstrap Prompt (New Project)

Use this prompt to set up the agent teams workflow in a new project:

Set up the agent teams workflow for this project:

1. Create the feature-docs/ directory structure:
   feature-docs/ideation/, feature-docs/ready/, feature-docs/testing/,
   feature-docs/building/, feature-docs/review/, feature-docs/completed/

2. Create an agent_logs/ directory for verbose output
   Add agent_logs/ to .gitignore

3. Verify that scripts/verify.sh and scripts/fast-verify.sh both exist:
   - verify.sh: full pipeline (type check + lint + tests) with output to agent_logs/
   - fast-verify.sh: type check only for quick feedback

4. Verify that .claude/settings.json includes TaskCompleted,
   TeammateIdle, and Stop hooks

5. Create a sample feature doc in feature-docs/ready/ based on the
   Feature Doc Format section in feature-docs/CLAUDE.md

6. Create an empty feature-docs/STATUS.md for the progress dashboard

7. Run the full verify pipeline once to confirm everything works

Report what you created and any issues found.

14. Retrofit Prompt (Existing Project)

Use this prompt to add the workflow to a project that already has code and tests:

Retrofit the agent teams workflow into this existing project:

1. Discovery — report the following:
   - Package manager and framework
   - Test runner and test directory structure
   - Component library and state management
   - Directory structure and naming conventions
   - Existing .claude/ configuration

2. Create the feature-docs/ directory structure alongside existing code

3. Verify scripts/verify.sh works with the existing toolchain:
   - Type checking command
   - Lint command
   - Test command

4. Check .claude/settings.json for existing hooks and add
   TaskCompleted, TeammateIdle, and Stop hooks without replacing
   existing configuration

5. Identify migration needs:
   - Test files not in a separate directory (need restructuring?)
   - Missing test coverage for critical paths
   - Files without clear ownership boundaries

Write a discovery report to agent_logs/discovery-report.md and
list any recommended changes (without acting on them).

15. Token Cost Expectations

Each teammate in an agent team uses roughly 5x the tokens of a single session, so a team of 3 (test-writer, builder, reviewer) working on a single feature uses approximately 15x a normal session's tokens. This is justified when:

The feature has clear, testable acceptance criteria
Files can be cleanly owned by one feature at a time
Quality gates (hooks) prevent wasted rework
The alternative is sequential context degradation in a single long session

For simple features (one file, clear spec), use a single Claude Code session. Reserve agent teams for features touching multiple files across stores, components, services, and tests.

16. Limitations

One team per session — a lead can only manage one team at a time
No nested teams — teammates cannot spawn their own teams
No session resumption for in-process teammates
Higher token costs than single sessions (each teammate has its own context)
Split panes require tmux or iTerm2 with the it2 CLI
Shutdown can be slow — teammates finish their current turn before exiting

Anti-Patterns

Anti-Pattern	Why It Fails	Fix
Builder modifies test files	Grading your own homework — tests lose independence as the oracle	Builder must never touch files created by test-writer
Builder works around defective tests	Production code is contorted to satisfy wrong assertions — e.g., returning error strings instead of raising exceptions because the test lacks `pytest.raises`	Builder runs Test Quality Audit before implementation; if tests are defective, STOP and create a bounce file — never write code to accommodate a bad test
Builder writes code to satisfy weak assertions	A test asserts truthiness (`is not None`) instead of specific values; builder writes a minimal stub that returns a placeholder	Builder's bright-line rule: if idiomatic code written without seeing the tests would not satisfy the assertion, the test is defective — bounce back
Skipping the test-writer step	No independent verification — builder's code is unchecked against the spec	Frontend: test-writer writes E2E tests after build. Python/Rust: test-writer writes failing tests before build
No file ownership declaration	Two agents edit the same file; merge conflicts and lost work	Feature docs must list `affected-files`; check for overlaps
Running parallel features on same branch	Merge conflicts, unclear ownership, broken bisect history	One branch per feature; merge to main sequentially
Passing full test output to agents	Context pollution fills the window with stack traces	Pass summary only: X passed, Y failed, first failure message
Feature doc without testable criteria	Test-writer cannot produce meaningful tests; builder has no target	Every acceptance criterion must use GIVEN/WHEN/THEN format
Skipping the reviewer step	Qualitative issues (conventions, duplication, design) go undetected	Reviewer validates what tests cannot catch
Using agent teams for trivial changes	15x token cost for a one-line fix is wasteful	Single session for changes touching fewer than 3 files
Running full test suite on every save	Agent wastes time waiting for slow tests during iteration	Use fast-verify.sh (type check only) on Stop; full suite on TaskCompleted
Tests that check truthiness, not values	Wrong implementation passes: `toBeTruthy()` accepts any truthy value, not just the correct one	Assert specific return values, error types, and state changes
No progress dashboard	Agents start with zero context and waste time re-discovering state	Update `feature-docs/STATUS.md` after every stage transition
Ignoring stuck features	Agent spins for hours on a hard problem without human awareness	TeammateIdle warns after 30 minutes in building/; check agent_logs/
Skipping feature doc lifecycle steps	Next agent never finds the feature doc; pipeline stalls indefinitely	`task-completed.sh` enforces status/directory sync; Completion Gate checklist in agent definitions
Coordinator edits implementation or test files	Violates role separation — coordinator and agent edit the same files, causing conflicts and undermining the test-as-oracle principle	Coordinator re-invokes the responsible agent with specific error details; never uses Write/Edit/sed on code
Coordinator fixes follow-up issues directly	Bypasses TDD — no failing test, no builder, no review; defeats the entire workflow even for "small" fixes	Route follow-ups through the full pipeline: test-writer → builder → reviewer; create a new feature doc or amend the existing one
Unbounded review → building loop	Builder and reviewer cycle indefinitely, burning tokens on issues the builder cannot resolve alone	Auto-loop up to 3 cycles; after 3, escalate to the user with remaining issues
Launching next agent before current one finishes	Both agents edit the same feature's files simultaneously, causing conflicts and lost work	Per-feature sequential: wait for each agent to complete before launching the next; cross-feature parallelism is fine with non-overlapping `affected-files`
Agent stays active after completing its stage	Idle agent reacts to next role's file changes, causing conflicts (e.g., builder "fixes" test-writer's new tests)	Exit Protocol in agent definitions: output report then STOP; TeammateIdle exits 0 to let agents die; coordinator launches fresh sessions
Reviewer fixes code directly	Defeats independence — reviewer can't objectively review code it wrote; bypasses TDD pipeline	Reviewer reports issues only; coordinator routes to test-writer (for test gaps) or builder (for implementation issues)
Ideation README never updated after pipeline	Feature appears incomplete in ideation folder; scanning for shipped features requires reading `completed/` instead of ideation metadata	Coordinator updates ideation README to `shipped` in "After reviewer approves" step
Feature docs without numeric prefix	Similarly-named features (user-auth.md vs user-auth-v2.md) cause agents to read the wrong doc from completed/ or other directories	Always use `scripts/next-feature-number.sh` to get a unique NNN- prefix at creation time
Running verify on test-writer output (Python/Rust)	Type errors on unresolved imports fire on every response; test failures block task completion	Hooks detect `testing` stage and stack via `lifecycle-stage.sh`; skip verification for Python/Rust TDD but not frontend build-first
Writing Vitest unit tests in frontend workflow	Unit tests break on every component refactor; internal APIs are unstable in vibe-coded UIs	Frontend test-writer writes Playwright E2E only; user-visible behavior is the stable contract
Picking up a feature with unmet dependencies	Implementation builds on code that doesn't exist yet; tests reference missing APIs; entire feature may need rework	Run `scripts/check-deps.sh` before pickup; agents and hooks check automatically
Deep dependency chains declared in a single doc	Stale chain data if intermediate features change; maintenance burden grows with chain length	Each doc declares only its immediate parent (`depends-on: NNN-name`); the script resolves the full chain dynamically from `completed/`
Circular dependencies between features	Pipeline deadlock — neither feature can proceed because each waits for the other	`check-deps.sh` detects cycles and exits with error; redesign features to break the cycle
Spawning agents without TeamCreate	No team lifecycle, no SendMessage, no shared task tracking — agents run in isolation	Create a team first with `TeamCreate`, spawn agents with `Agent` tool, coordinate with `SendMessage`
Forgetting TeamDelete after pipeline	Orphaned team config persists in `~/.claude/teams/`; stale task lists accumulate	Always `shutdown_request` all teammates then `TeamDelete` after the pipeline completes
Starting next feature while on a feature branch	New feature branches from previous feature instead of main; creates dependency stacking where features can't be merged independently	Pre-flight check in implement-feature.md verifies `git rev-parse --abbrev-ref HEAD` returns `main`; refuse to start until on main
Skipping merge step after reviewer approval	Feature branch sits unmerged; next feature branches from stale state; causes cascading dependency chain across features	Coordinator creates PR with `gh pr create`, merges with `gh pr merge --squash --delete-branch`, then returns to main
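The dependency rows in the table above (immediate-parent declarations, dynamic chain resolution, cycle detection) can be illustrated with a short sketch. This Python example is not the logic of check-deps.sh; `resolve_chain` and the in-memory `depends_on` mapping are hypothetical stand-ins for reading `depends-on` fields from feature docs.

```python
from typing import Optional


def resolve_chain(feature: str, depends_on: dict[str, Optional[str]]) -> list[str]:
    """Follow immediate-parent links from feature back to the root.

    depends_on maps each feature name to its immediate parent (or None).
    Raises ValueError if the links form a cycle.
    """
    chain = []
    seen = {feature}
    current = depends_on.get(feature)
    while current is not None:
        if current in seen:
            raise ValueError(f"circular dependency involving {current}")
        seen.add(current)
        chain.append(current)
        current = depends_on.get(current)
    return chain
```

With `{"003-c": "002-b", "002-b": "001-a", "001-a": None}`, resolving `"003-c"` walks through `002-b` and then `001-a`. A pickup check would then confirm that every name in the chain has a doc in completed/.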