# agent-teams: Agent Teams Orchestrator

## Critical Rules
Use Claude Code's NATIVE agent teams feature. Agent teams are a first-class capability built into Claude Code — do NOT simulate them with bash subprocesses or CLI commands.
- DO: Directly describe the team you want to create in natural language. The platform spawns teammates natively.
- DO NOT: Use `bash`, `subprocess`, or the `claude` CLI to create parallel agents.
- DO NOT: Simulate teams by running multiple shell commands.
## When to Use Agent Teams vs Subagents vs Single Session
Agent teams add coordination overhead and use significantly more tokens. Choose wisely:
| Scenario | Use | Why |
|---|---|---|
| Teammates need to share findings and challenge each other | Agent Teams | Peer-to-peer messaging, shared task list |
| Quick, focused workers that report back independently | Subagents | Lower overhead, no inter-agent coordination needed |
| Sequential tasks, same-file edits, many dependencies | Single Session | No parallelism benefit |
Key difference: Subagents can only report back to the parent. Agent team teammates message each other directly, self-claim tasks, and self-coordinate.
Decision rule: If workers need to communicate with each other → Agent Teams. If not → Subagents or Single Session.
## How Agent Teams Work
- Team lead: Your main session. Coordinates work, assigns tasks, synthesizes results.
- Teammates: Separate Claude Code instances, each with its own context window. They can read/write files, run commands, and message other teammates and the lead.
- Shared task list: Central work items with dependency tracking. Tasks auto-unblock when dependencies finish. Teammates self-claim available tasks.
- Mailbox: Inter-agent messaging. Lead ↔ teammates, teammate ↔ teammate.
Critical: Teammates do NOT inherit the lead's conversation history. Whatever context they need, the lead must provide in the spawn prompt. Be generous with initial briefing.
## Execution Workflow

### Step 1: Assess the Task
Before creating a team, ask:
- Can this be done in a single session? → Do it yourself.
- Does it benefit from parallel work? → Continue.
- Do workers need to communicate with each other? → Agent Teams. If not → Subagents.
- Are the subtasks naturally independent (different files/modules)? → Good fit for teams.
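The questions above reduce to a short decision rule, sketched here as a checklist function (the name and signature are illustrative, not part of any API):

```python
def choose_execution_mode(needs_parallelism: bool, workers_must_communicate: bool) -> str:
    """Step 1 decision rule: stay in a single session unless parallel work
    pays off; use agent teams only when workers must talk to each other."""
    if not needs_parallelism:
        return "single_session"
    return "agent_teams" if workers_must_communicate else "subagents"

print(choose_execution_mode(False, False))  # single_session
print(choose_execution_mode(True, False))   # subagents
print(choose_execution_mode(True, True))    # agent_teams
```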
Best use cases for agent teams:
- Multi-perspective code review (security + performance + maintainability)
- Cross-layer coordination (frontend + backend + tests, each owned by a different teammate)
- Competing hypotheses / research from different angles
- Data-parallel work (each teammate handles a segment)
- Large feature development with naturally separable components
Poor fit:
- Sequential tasks with strong dependencies
- Multiple teammates editing the same file (causes conflicts)
- Simple tasks a single session handles fine
### Step 2: Design the Team
Start with research, then implement. Agent teams work best when they begin with investigation, review, and analysis — then move to implementation if needed.
Principles from the C compiler project:
- Task decomposition is everything: Break work into modules that can be independently developed and tested. Each teammate should own distinct files/directories.
- Write clear, high-quality acceptance criteria: Agents self-orient from scratch with no prior context. Tests and clear specs keep them on track without human oversight.
- Minimize team size: Each teammate = separate context window = more tokens. Use the fewest teammates that cover the work.
- Avoid file conflicts: Assign different files/directories to different teammates. Never have two teammates edit the same file.
- Provide rich context in spawn prompts: Include specific file paths, relevant standards, expected output format. Vague prompts produce vague results.
Select the right team structure based on task type.

IMPORTANT: Every team MUST include a `challenge_agent`. This role is not optional — it improves quality by challenging assumptions and finding blind spots that specialists miss.
### For Code Review / Audit

Roles (4-5 teammates):

- `security_auditor`: OWASP Top 10, injection, auth/authz, dependency vulnerabilities
- `performance_engineer`: Complexity, N+1 queries, memory, caching, bundle size
- `maintainability_expert`: SOLID, DRY, naming, error handling, test coverage
- `challenge_agent`: Dedicated skeptic — challenges assumptions, finds blind spots, questions each reviewer's conclusions (REQUIRED in every team)
- `qa_specialist` (optional): Edge cases, race conditions, error paths
Coordination: Work independently → challenge_agent questions everyone's findings → cross-check each other's findings → consolidate into single report.
### The Challenge Agent Role

The `challenge_agent` (or `devil_advisor`) is a dedicated skeptic that improves team quality.
Responsibilities:
- Question assumptions made by other teammates
- Identify blind spots and edge cases others missed
- Challenge design decisions from different angles
- Act as a "red team" to stress-test conclusions
Why it matters:
- Specialists focus on their domain → miss cross-cutting concerns
- A dedicated skeptic catches what everyone else overlooked
- Forces explicit reasoning instead of implicit assumptions
Example prompt:
challenge_agent: "Review all findings from security_auditor, performance_engineer, and maintainability_expert. Question their conclusions: What assumptions are they making? What edge cases did they miss? What would make this code fail in production? Output: challenges.md with ranked concerns by severity."
### For Feature Development

Scale roles by complexity:

Simple (4 roles): Single-page, basic CRUD

- `fullstack_developer`: End-to-end implementation
- `ui_reviewer`: UX validation, interaction flows
- `challenge_agent`: Challenges design decisions, identifies edge cases and risks (REQUIRED in every team)
- `code_reviewer`: Quality, best practices, security
Standard (5-6 roles): Multi-component, API integration

- `system_architect`: API contracts, data models, service boundaries
- `frontend_specialist`: Components, state management, accessibility
- `backend_specialist`: Business logic, validation, authorization
- `integration_tester`: Cross-layer verification, E2E scenarios
- `challenge_agent`: Challenges design decisions, identifies edge cases and risks (REQUIRED in every team)
- `code_reviewer`: Quality, patterns, security
Complex (6-8 roles): Full-stack module, database design, deployment

- Add `database_engineer` and `devops_engineer` to the Standard roles
Coordination: Architect defines contracts → Specialists implement in parallel → Testers verify → Reviewer ensures quality → Consolidate.
### For Debugging / Investigation

Roles (3-4 teammates):

- `log_analyst`: Trace reconstruction, timeline, patterns
- `code_auditor`: Static analysis, state consistency, root cause hypotheses
- `reproduction_lead`: Minimal repro, environment simulation
- `challenge_agent`: Challenges each investigator's hypothesis, pushes for more evidence (REQUIRED in every team)
Coordination: Each investigates a different hypothesis → challenge_agent questions conclusions → debate and disprove → converge on root cause.
### For Research / Evaluation

Roles (3-4 teammates):

- `advocate_a`: Deep dive into option A — strengths, weaknesses, real-world examples
- `advocate_b`: Deep dive into option B
- `synthesizer`: Objective comparison, scoring matrix, recommendation
- `challenge_agent`: Challenges each advocate's reasoning, questions assumptions (REQUIRED in every team)
Coordination: Advocates research independently → challenge_agent questions conclusions → synthesizer produces final analysis.
### Step 3: Create the Team
Tell Claude what you want in natural language. Be specific about roles and scope.
Use this pattern:

```text
[Brief analysis of why a team is needed]

Create an agent team to [objective].

Spawn [N] teammates:
- [role_name]: "[Goal sentence]. Focus: [specific areas with file paths if applicable]. Output: [deliverable file/format]."
- [role_name]: "[Goal sentence]. Focus: [specific areas]. Output: [deliverable file/format]."
...

Coordination:
- [How teammates collaborate and cross-check]
- [File ownership: who writes where]
- [How results are consolidated]

Wait for teammates to finish.
```
Role definition quality matters:
Good:
security_auditor: "Audit src/auth/ for security vulnerabilities. Focus: JWT token storage in src/auth/token.ts, session management in src/auth/session.ts, CSRF protection, password hashing, rate limiting. Output: security_review.md with findings sorted by CVSS severity."
Bad:
security_guy: "Check security stuff"
### Step 4: Coordinate the Team

As lead agent, manage using:

| Action | How | When |
|---|---|---|
| Direct a teammate | `Ask [teammate]` | Assign specific work or ask for status |
| Message all | `Broadcast` | Share updates affecting everyone |
| Wait | `Wait for teammates to finish` | Before consolidating results (always) |
| Clean up | `Clean up the team` | When done (mandatory, no exceptions) |
Assignment modes:
- Lead assigns: Explicitly assign tasks with `Ask [teammate]`. Use for sequenced or specialized work.
- Self-claiming: Teammates auto-pick unblocked tasks from the shared list. Use for independent, well-defined tasks.
Optional controls:
- Plan approval: Require teammates to get approval before making changes. Use for high-risk modifications (database schema, auth logic).
- Delegate mode: Teammates work more autonomously with less lead oversight. Use when objectives are clear and independent.
### Step 5: Consolidate & Clean Up
After all teammates finish:
- Collect all deliverables from output files
- Cross-reference findings between teammates — look for conflicts, gaps, and duplicates
- Synthesize into a unified summary
- Report to the user with key findings and recommendations
- Clean up the team — MANDATORY
## Lessons from the C Compiler Project
These patterns come from Anthropic's experiment where 16 agents built a 100,000-line C compiler:
- When tasks are naturally independent, parallelization is trivial: Each agent picks a different failing test or module to work on. The shared task list handles coordination.
- Giant monolithic tasks cause agents to get stuck: Unlike hundreds of independent tests, a single massive task can't be parallelized. Break big deliverables into smaller, testable chunks.
- Tests are the primary coordination mechanism: With high-quality tests, agents stay on track without human oversight. New commits can't break existing code if CI enforces it.
- Agents have no prior context on each session: Each agent starts fresh. Write clear README/CLAUDE.md files so agents can self-orient quickly. Include file paths, architecture notes, and conventions.
- Agents waste time without constraints: Without guardrails, agents will spend hours running tests or exploring tangents. Provide fast feedback loops (quick test samples, clear progress indicators).
- File-based coordination works: Agents "lock" tasks by writing files. Git merge conflicts serve as tiebreakers. Save all results to files since session state doesn't persist.
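The file-based "lock" pattern can be sketched with atomic exclusive file creation. This is an illustration of the technique, not the project's actual code; names like `try_claim` and the task IDs are made up:

```python
import os
import tempfile

def try_claim(lock_dir, task_id, agent_id):
    """Claim a task by creating its lock file exclusively; O_EXCL makes the
    create atomic, so exactly one agent wins even if two race for the task."""
    path = os.path.join(lock_dir, f"{task_id}.lock")
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another agent already holds this task
    with os.fdopen(fd, "w") as f:
        f.write(agent_id)  # record the owner for debugging
    return True

locks = tempfile.mkdtemp()
print(try_claim(locks, "fix_test_042", "agent_a"))  # True: first claim wins
print(try_claim(locks, "fix_test_042", "agent_b"))  # False: already claimed
```

Because the claim itself is a file, it survives session resets and is visible to every teammate, which is exactly why the pattern works for agents whose in-memory state doesn't persist.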
## Hooks for Agent Teams
Use hooks to enforce rules automatically:
- `TeammateIdle`: Runs when a teammate is about to go idle. Exit with code 2 to send feedback and keep them working.
- `TaskCompleted`: Runs when a task is being marked complete. Exit with code 2 to prevent completion and send feedback (e.g., run additional validation).
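A `TaskCompleted` hook can be sketched as a small script that inspects the event and exits with code 2 to block completion and return feedback. This is a sketch under assumptions: the `expected_output` payload field is hypothetical, so check the hooks reference for the real event schema:

```python
import json
import os
import sys

def check_task_output(event: dict):
    """Return (ok, feedback) for a TaskCompleted event.
    `expected_output` is a hypothetical payload field, assumed for illustration."""
    expected = event.get("expected_output")
    if expected and not os.path.exists(expected):
        return False, f"Deliverable {expected} is missing; write it before completing."
    return True, ""

# In a real hook script the event arrives as JSON on stdin:
#   ok, feedback = check_task_output(json.load(sys.stdin))
#   if not ok:
#       print(feedback, file=sys.stderr)
#       sys.exit(2)  # exit code 2 blocks completion; stderr is sent back as feedback
print(check_task_output({"expected_output": "/nonexistent/review.md"})[0])  # False
```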
## Critical Reminders

- No session resume: `/resume` doesn't restore teammates → Always save results to files
- File conflicts: Assign different files to different teammates; this is the biggest pitfall
- Token cost: Each teammate = separate context window → Keep team size minimal
- Context: Teammates don't inherit the lead's conversation history → Provide full context in spawn prompts
- Always wait: Don't consolidate until all teammates finish
- Always clean up: Terminate the team when done, no exceptions
## Full Example
User: "Use agent-teams to review the authentication module for security, performance, and code quality."
Response:
This task benefits from parallel specialized review — three independent perspectives on the same code, with cross-checking. Creating a review team.
Create an agent team to review the authentication module.
Spawn three reviewers:
- security_auditor: "Audit src/auth/ for security vulnerabilities. Focus: JWT handling in src/auth/token.ts, password hashing in src/auth/crypto.ts, session management in src/auth/session.ts, CSRF protection, rate limiting. Output: Write findings with severity ratings to reviews/security_review.md."
- performance_engineer: "Analyze src/auth/ for performance issues. Focus: N+1 queries in src/auth/queries.ts, token validation overhead, caching opportunities, connection pooling in src/auth/db.ts. Output: Write optimization recommendations to reviews/performance_review.md."
- maintainability_expert: "Review src/auth/ for code quality issues. Focus: SOLID violations, error handling gaps, naming consistency, test coverage in tests/auth/. Output: Write refactoring suggestions to reviews/maintainability_review.md."
Coordination:
- Each reviewer works independently on their domain — no file conflicts since all write to separate output files
- Security auditor cross-checks performance recommendations for security implications
- Performance engineer validates security suggestions won't create bottlenecks
- All findings consolidated into reviews/auth_review_summary.md
Wait for teammates to finish.
## Troubleshooting
### Issue: Teammates not responding

Cause: Context window limit or message delivery delay.

Solution:
- Verify the teammate is still running (check task status)
- Resend the message with more explicit instructions
- If stuck, terminate and respawn the teammate
### Issue: File conflicts between teammates

Cause: Multiple teammates assigned to the same files.

Solution:
- Immediately stop the team
- Redistribute file ownership so each file has exactly one owner
- Resume with clear ownership boundaries
### Issue: Teammate produces generic/low-quality output

Cause: Vague role definition in the spawn prompt.

Solution:
- Be specific about file paths, focus areas, output format, and the quality bar
- Example: "Write security findings to reviews/sec.md with CVSS severity ratings"
### Issue: Team never finishes (infinite loop)

Cause: Unclear completion criteria or circular dependencies.

Solution:
- Define explicit exit conditions in the coordination section
- Set a maximum task count or time limit
- If stuck, manually intervene with "Please summarize your progress and stop"
### Issue: Results not consolidating properly

Cause: Teammates wrote to the wrong files, or their formats differ.

Solution:
- Always specify exact output file paths and formats in role definitions
- The lead agent should create a consolidation template before spawning the team
- Review intermediate outputs before final consolidation
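One way to make consolidation mechanical is for the lead to write the summary skeleton before spawning the team. A minimal sketch, reusing the file names from the Full Example above (adapt the headings to your task):

```markdown
# Auth Review Summary

## Inputs
- reviews/security_review.md
- reviews/performance_review.md
- reviews/maintainability_review.md

## Findings by Severity
| Severity | Finding | Reviewer | Location |
|---|---|---|---|

## Conflicts Between Reviewers
<!-- e.g., a caching recommendation that weakens a security control -->

## Consolidated Recommendations
```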