pipeline
Pipeline
Value: Flow -- continuous, gated progression from slice to shipped code eliminates idle coordination overhead and makes quality evidence accumulate automatically.
Purpose
Orchestrates the build phase end-to-end: takes vertical slices from the queue, dispatches TDD pairs, runs full-team code review before push, enforces mutation testing and CI gates, and merges or escalates. The pipeline controller handles all operational tasks (running tests, pushing code, managing git) so team member agents focus exclusively on creative work.
Practices
Slice Queue Management
The pipeline reads vertical slices from event model output and maintains their
state in .factory/slice-queue.json. See references/slice-queue.md for the
full queue schema and operations.
Each slice carries a context block with enriched metadata for downstream
pipeline stages: acceptance_scenarios (full-stack GWT scenarios exercising
the application boundary, human-owned) and domain_scenarios (event-model
unit scenarios for commands and projections), an event model source path (so
the TDD pair can consult the original model), related slice IDs for cross-slice
dependency tracking, domain types referenced by the slice (enabling
pre-implementation type discovery), and UI components referenced (triggering
conditional context gathering when the slice touches the interface layer). See
references/slice-queue.md for the full two-tier scenario schema.
Vertical slice integrity: Every slice MUST deliver behavior through a user-facing boundary (UI for a human) or an external-API boundary (for third-party software consumers). Internal APIs that serve the application's own front-end are NOT external — slices touching them still require UI acceptance tests. There are no "domain-only," "infrastructure-only," or "backend-only" slices — infrastructure exists to serve functionality. If a slice does not include a user-facing or external-API boundary, it is not a vertical slice and MUST NOT enter the queue. Any rare deviation (e.g., backend restructuring with no net-new acceptance tests) requires explicit human approval before the slice enters the queue.
Ordering strategy: Walking skeleton first (the thinnest end-to-end path), then by dependency graph (slices whose predecessors are complete), then by priority. The first slice in any project must always be a walking skeleton.
Slice states: pending, active, blocked, completed, escalated. Only one
slice may be active at a time unless running at full autonomy level, where
parallel slices operate in isolated git worktrees (one worktree per active
slice at .factory/worktrees/<slice-id>). See references/autonomy-levels.md.
Per-Slice Pipeline
Each slice flows through a fixed sequence of stages. Every stage has a binary quality gate. A gate failure routes back for rework; it never skips forward.
-
Decompose -- Invoke
task-managementto break the slice into leaf tasks with acceptance criteria. Output: task tree in.factory/audit-trail/slices/<slice-id>/decomposition.json. -
Slice Readiness Review (full ensemble, before first TDD cycle)
- Convene the full ensemble using subagent spawning
- Produce an approved Slice Plan document including the Agent Delivery Contract
(see
references/slice-readiness-review.mdandreferences/agent-delivery-contract.md) - Gate: the build trio does NOT start until the plan is approved (all members, zero open items, human sign-off on Contract Tiers 1–3, zero active-slice domain type or bounded context conflicts)
- Create Tasks from the plan's Task Breakdown section with
blockedByencoding component pre-work dependencies
-
Implement -- TDD skill invocation gate (BLOCKING): Before dispatching any TDD work, invoke the
tddskill. The TDD skill detects harness capabilities and loads the appropriate strategy entry-point (orchestrator.md for subagent strategy, chaining rules for chaining strategy). The pipeline MUST NOT dispatch TDD phases until the TDD skill's orchestration rules are loaded — without them, the pipeline lacks scenario boundary classification, outside-in progression enforcement, and drill-down rules. This is not optional prose; it is a prerequisite gate.Before dispatching the TDD pair, the pipeline gathers pre-implementation context scoped to what TDD needs: Contract Tiers 2–3 (acceptance scenarios and feature constraints), existing domain types matching the slice's referenced types, and event model context. If the slice touches UI, design system components are included. Do NOT pass full conversation history or all contract tiers — see
references/tokenomics.mdfor per-phase context scoping rules.TDD dispatch (the pipeline controller IS the orchestrator): The pipeline controller performs capability detection (per TDD skill's hierarchy) and dispatches directly. Do NOT spawn a single "orchestrator" subagent -- that hides work and bypasses strategy detection.
- Agent tool available: The pipeline controller acts as the
orchestrator per
tdd/references/orchestrator.md, spawning per-phase subagents usingAgent(subagent_type="<agent-name>", prompt="...")with fresh context each time. When.claude/agents/definitions exist, named personas are used for ping and pong roles pertdd/references/ping-pong-pairing.md. The orchestrator collects results from each subagent and passes them as context to the next. - Neither available: The pipeline controller runs chaining mode, playing each TDD role sequentially per the TDD skill's chaining section.
The TDD pair then works through red-green-domain-commit cycles without consensus rounds. No team discussion during implementation. Output: passing tests, committed code, cycle evidence in
.factory/audit-trail/slices/<slice-id>/tdd-cycles/. - Agent tool available: The pipeline controller acts as the
orchestrator per
-
Review -- Before pushing, invoke
code-reviewwith the full team for three-stage mob review (spec compliance, code quality, domain integrity). All review findings must be addressed before proceeding. Seereferences/gate-definitions.mdfor review gate criteria. -
Address feedback -- Route review findings back to the TDD pair. The pair fixes via their existing ping-pong process. Re-review scoped to flagged concerns only (unless critical-category feedback).
-
Mutation test -- Invoke
mutation-testingscoped to changed files. Required: 100% kill rate. Survivors route back to the TDD pair. -
Push and CI -- Pipeline pushes the branch and waits for CI. On infra failures at standard or full autonomy, auto-retry once. On test failures, route to triage.
-
Merge or flag -- If all gates pass, merge (auto at full autonomy, manual otherwise). If any gate failed after exhausting rework budget, escalate to human.
Quality Gates
Five binary gates. Each requires structured evidence. See
references/gate-definitions.md for full criteria, evidence schemas, and
failure routing.
| Gate | Pass Criteria | Failure Route |
|---|---|---|
| TDD | Acceptance scenario test passes (boundary-level), all units pass, domain review approved | Back to tdd pair |
| Review | All three stages PASS | Back to tdd pair with findings |
| Mutation | 100% kill rate on changed files | Back to tdd pair with survivors |
| CI | Pipeline green | ci-integration triage |
| Merge | All upstream gates pass, no blocking escalations, branch current | Rebase and re-gate |
Rework Protocol
Gate failures route back to the appropriate skill. Each gate allows a maximum
of 3 rework cycles per slice. After 3 failures at the same gate, the pipeline
halts the slice, compiles full context (all attempts, all evidence), and
escalates to the human. See references/rework-protocol.md for routing rules
and tracking schema.
Progressive Autonomy
The pipeline operates at one of three autonomy levels configured in
.factory/config.yaml. See references/autonomy-levels.md for full details.
- Conservative: All gates enforced, no auto-merge, human notified of every gate result, human approves every merge.
- Standard: Auto-rework within budget, batch non-blocking findings, skip trivial review items, auto-retry CI on infra failures.
- Full: Auto-merge when all gates pass, parallel slice execution using
git worktree isolation (each active slice gets its own worktree; conflicts
are detected at merge time), pair selection and slice ordering optimization.
If
git worktreeis not available, the pipeline falls back to sequential execution and logs a warning.
Pipeline Operates Directly
Unlike the ensemble-team coordinator (which delegates everything), the pipeline controller directly executes operational tasks: running test suites, invoking mutation tools, pushing branches, checking CI status, managing git operations, updating queue state. Team member agents are invoked only for creative work: writing tests, writing code, conducting reviews.
Controller Role Boundaries
The pipeline controller is an orchestrator, not a developer. Respect these boundaries absolutely.
The controller MAY:
- Run test suites, mutation tools, and CI checks
- Execute git operations (commit, push, rebase, merge)
- Read files for context gathering
- Manage queue state and audit trail
- Spawn and coordinate agents
- Route rework findings back to the appropriate agent
The controller MUST NOT:
- Write or edit test files
- Write or edit production code
- Write or edit type definitions
- Write or edit documentation content
- Make design decisions
- Conduct code reviews
- Fix failing tests
- Refactor code
- Start TDD cycles on a slice without an approved Slice Plan
- Allow the build trio to discover missing UI components or undefined domain types mid-cycle — these are identified in the Slice Readiness Review
If you catch yourself about to write code — even "just one line" — stop and delegate. The temptation is strongest when a fix seems trivial, but trivial fixes bypass review and accumulate into unreviewed code.
Per-Slice Gate Task Tracking
At the start of every slice, create a gate checklist at
.factory/audit-trail/slices/<slice-id>/gates.md:
# Gate Checklist: [slice-id]
- [ ] Decompose: task tree created
- [ ] Implement: TDD cycles complete, all tests passing
- [ ] Review: all three stages PASS, findings addressed
- [ ] Mutation: 100% kill rate on changed files
- [ ] CI: pipeline green
- [ ] Merge: all upstream gates pass, branch current
Update each item as the gate completes. After crash or context compaction, read this checklist to determine the resume point — do not guess from memory.
Session Resilience
Long pipeline runs are vulnerable to context compaction and crashes.
Self-reminder protocol: See CONSTRAINT-RESOLUTION.md in the template
directory for the consolidated self-reminder protocol (frequency, combined
re-read list, and post-compaction rules).
Crash recovery: See references/crash-recovery.md for the full
recovery procedure. Key principle: read persistent state, do not
reconstruct from memory.
What You Are NOT
You are NOT a developer. You are NOT a reviewer. You are NOT an architect. You are NOT the team. You are the pipeline controller — you manage flow, enforce gates, and delegate creative work. If you find yourself writing code, conducting a review, or making a design decision, stop. Delegate.
Audit Trail
Every pipeline action produces structured evidence in .factory/audit-trail/.
See references/audit-trail-schema.md for the directory layout and JSON
schemas. The audit trail enables retrospectives, trend analysis, and
reproducible escalation context. Every agent-generated commit includes a
provenance footer (Slice: <id> | Scenario: <id>) appended by the pipeline
controller; TDD cycle records include a provenance block with agent roles,
skill version, and scenario ID.
Enforcement Note
Gating in all modes. The pipeline provides gating enforcement: no stage proceeds without meeting gate criteria. Gate failures route to rework, never bypass. This prevents low-quality work from reaching the main branch. It cannot prevent low-quality work from being written -- that depends on the advisory/structural enforcement of the skills invoked during each phase.
Hard constraints:
- Gate failures route to rework, never skip:
[H] - Rework budget exhausted (3 cycles):
[RP]-- escalate, do not force a 4th cycle - Walking skeleton first:
[H]-- first slice in any project must be a walking skeleton
See CONSTRAINT-RESOLUTION.md in the template directory for all known
cross-skill conflicts.
Constraints
- Controller role boundaries: The controller NEVER writes code, tests, types, or documentation content. "Operational" tasks like updating gate checklists, writing WORKING_STATE.md, and managing audit trail files are not "writing code." But writing a "helper script" or "utility function" to assist orchestration IS writing code and is forbidden.
- "No consensus during TDD build": This means the TDD pair implements without team discussion. It does NOT mean the pair cannot ask clarifying questions about the spec. A clarifying question is: "Does the AC mean X or Y?" A consensus round is: "Should we use approach A or approach B?" The former is allowed; the latter is not.
Verification
After completing a slice through the pipeline, verify:
- Slice was decomposed into leaf tasks with acceptance criteria
- Pre-implementation context was gathered and passed to the orchestrator
- TDD pair implemented without consensus rounds (no team discussion)
- Full-team code review occurred before push (all three stages)
- All review findings were addressed before proceeding
- Mutation testing achieved 100% kill rate on changed files
- CI pipeline passed green
- Merge occurred only after all five gates passed
- Audit trail entries exist for every stage
- Acceptance test exercises the application boundary
- Rework cycles (if any) stayed within the 3-cycle budget per gate
- Controller never wrote code (tests, production, types, or docs)
- Gate task list maintained for every slice
- WORKING_STATE.md kept current throughout the run
- State re-read after any crash or compaction (not guessed)
Dependencies
This skill requires other skills to function. It orchestrates them; it does not replace them.
- tdd: Pair implementation engine. The pipeline dispatches TDD pairs and collects cycle evidence.
- code-review: Three-stage review protocol. The pipeline invokes full-team review before every push.
- mutation-testing: Quality gate for test effectiveness. The pipeline runs mutation testing scoped to changed files.
- task-management: Work decomposition. The pipeline decomposes slices into leaf tasks before implementation.
- ci-integration: CI pipeline monitoring. The pipeline pushes and waits for CI results.
Missing a dependency? Install with:
npx skills add jwilger/agent-skills --skill tdd