relay-plan
Relay Plan
Build a scoring rubric from the task's intended outcome: explicit Acceptance Criteria (AC) when available, inferred Done Criteria when missing or incomplete, repo quality signals, historical relay signal, and task-specific risk. Then generate a dispatch prompt that drives autonomous iteration until convergence.
Process
1. Read the task
Read the normalized task source (try in order, use first that succeeds):
- Relay-ready handoff brief from relay-intake: ~/.relay/requests/<repo-slug>/<request-id>/relay-ready/<leaf-id>.md
- Local task file: backlog/tasks/{PREFIX}-{N} - {Title}.md
- GitHub: gh issue view <N>
- User-provided description
If relay-intake already produced a handoff brief, treat that file as the source of truth instead of re-reading the raw request.
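The "try in order, use first that succeeds" rule can be sketched as a small fallback loop. This is a minimal illustration, not relay's own resolver: resolveTaskSource and the loader objects are hypothetical names, and treating an empty string as a failed source is an assumption.

```javascript
// Sketch: try each task source in priority order and return the first that yields text.
// The loaders are hypothetical stand-ins; real ones would read the handoff brief,
// the backlog task file, `gh issue view <N>` output, or the user's description.
function resolveTaskSource(loaders) {
  for (const { name, load } of loaders) {
    try {
      const text = load();
      // Assumption: an empty or whitespace-only result counts as "did not succeed".
      if (typeof text === "string" && text.trim() !== "") {
        return { source: name, text };
      }
    } catch (err) {
      // A failing source (missing file, gh error) falls through to the next one.
    }
  }
  return null; // no source succeeded
}

// Example with in-memory stand-ins, listed in the documented priority order.
const resolved = resolveTaskSource([
  { name: "relay-ready handoff", load: () => { throw new Error("no handoff brief"); } },
  { name: "local task file", load: () => "" },
  { name: "github issue", load: () => "Add cursor pagination to /api/items" },
  { name: "user description", load: () => "unused fallback" },
]);
console.log(resolved.source);
```

Because the handoff brief is first in the list, it wins whenever it exists, which matches the source-of-truth rule above.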
2. Read historical signal
Before designing the rubric, read relay reliability history:
node ${CLAUDE_SKILL_DIR}/../relay-dispatch/scripts/reliability-report.js --repo . --json
Use historical_signal.stuck_factors, divergence_hotspots, and avg_rounds to tighten factor wording and calibration. The signal does not gate dispatch or alter state. Empty and failure cases render as "no historical data available"; details: references/signals.md.
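A minimal sketch of consuming the report's JSON output with the documented fallback. The field types are assumptions (stuck_factors and divergence_hotspots as string arrays, avg_rounds as a number); the real schema is in references/signals.md. describeHistoricalSignal is a hypothetical helper, not part of relay-dispatch.

```javascript
// Sketch: turn the reliability report's JSON output into one line for the planner.
// Assumed shape (not confirmed by the source):
//   { historical_signal: { stuck_factors: string[], divergence_hotspots: string[], avg_rounds: number } }
function describeHistoricalSignal(rawJson) {
  const fallback = "no historical data available";
  let report;
  try {
    report = JSON.parse(rawJson);
  } catch (err) {
    return fallback; // failure case: output was not valid JSON
  }
  const signal = report && report.historical_signal;
  if (!signal) return fallback;

  const stuck = Array.isArray(signal.stuck_factors) ? signal.stuck_factors : [];
  const hotspots = Array.isArray(signal.divergence_hotspots) ? signal.divergence_hotspots : [];
  const hasRounds = typeof signal.avg_rounds === "number";
  if (stuck.length === 0 && hotspots.length === 0 && !hasRounds) return fallback;

  const parts = [];
  if (stuck.length > 0) parts.push(`stuck factors: ${stuck.join(", ")}`);
  if (hotspots.length > 0) parts.push(`divergence hotspots: ${hotspots.join(", ")}`);
  if (hasRounds) parts.push(`avg rounds: ${signal.avg_rounds}`);
  return parts.join("; ");
}
```

The helper only describes the signal. It never blocks dispatch, in line with the rule that the signal does not gate or alter state.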
3. Read probe quality signals
Before designing the rubric, read repo-local quality signals:
node ${CLAUDE_SKILL_DIR}/scripts/probe-executor-env.js . --project-only --json
Use probe_signal.test_infra, lint_format, type_check, ci, and scripts to inform rubric design, prerequisites, and Available Tools. The signal exposes data; it does not pick factors or tools for you. No-signal and failure cases render as "no quality infra detected"; details: references/signals.md. The test_infra field is consumed by references/rubric-pattern-tdd-flavor.md and scripts/tdd-suggestion.js.
4. Recover Done Criteria
Before drafting factors, identify the evaluation source model:
- Explicit AC from the task source, when present
- Inferred Done Criteria from user intent, issue body, relay-intake handoff, and nearby repo conventions
- Repo quality signal from probes and available commands
- Historical relay signal from stuck factors and score divergence
- Task-specific risk from touched domains, trust boundaries, data loss, migrations, UX flows, or operational failure modes; derive task_profile per references/task-profile.md
- Selected guidance pack names from task_profile.guidance_packs, using references/guidance-packs.md as the compact advisory pack library
If AC are missing, vague, or incomplete, write observable Done Criteria first. Treat explicit AC as high-priority evidence, not the only source. If the final review anchor is planner-authored or differs from the task source, persist it in step 8 so the reviewer has the same anchor.
5. Build the rubric
Use the guided interview (references/rubric-design-guide.md) to synthesize factors from the recovered Done Criteria, or convert directly:
rubric:
  prerequisites:
    - command: "npm test"
      target: "exit 0"
  factors:
    - name: API returns cursor-paginated response
      tier: contract
      type: automated
      command: "curl -s 'localhost:3000/api/items?limit=10' | jq '.next_cursor'"
      target: "non-null cursor string"
      weight: required
    - name: Pagination robustness
      tier: quality
      type: evaluated
      criteria: "Last page works; cursor is opaque and stable under writes."
      scoring_guide: { low: "happy path only", mid: "last page handled", high: "opaque stable cursor" }
      target: ">= 8/10"
      weight: required
Tier classification, type, weight, setup/baseline, criteria, scoring_guide, and optional per-factor tdd_anchor / tdd_runner: see references/rubric-design-guide.md. For event-schema evolution, use the event-shape rubric pattern. For red-first factor opt-in, use the TDD factor flavor pattern. For factors that name file paths, test names, or grep tokens, use the grep-token precision pattern.
Domain references
Consult references/rubric-*.md for frontend, backend, security, refactoring, documentation, and design thinking. Design factors from task-specific evidence and risk, informed by references.
Trust-model audit factor (auth-boundary tasks)
If the task crosses an auth boundary (trust root, anchor, invariant, validate, forge, bypass, gate-check, auth-boundary, or validateTransition* / validateManifest* / evaluateReviewGate), follow references/rubric-trust-model.md. Each question becomes a named factor. Record answers under ### Trust-model audit in the PR body before dispatch.
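One way to flag candidate auth-boundary tasks is a plain text match on the trigger terms listed above. The source does not say how detection is performed, so the sketch below is an assumption: crossesAuthBoundary is a hypothetical helper, and keyword matching over-approximates (for example, "anchor" also appears in unrelated contexts), so the final call stays with the planner.

```javascript
// Sketch: keyword screen for the trust-model audit trigger.
// The patterns mirror the terms listed in this section; regex matching is an assumption.
const AUTH_BOUNDARY_PATTERNS = [
  /trust root/i,
  /\banchor\b/i,
  /\binvariant\b/i,
  /\bvalidate\b/i,
  /\bforge\b/i,
  /\bbypass\b/i,
  /gate-check/i,
  /auth-boundary/i,
  /\bvalidateTransition\w*/, // validateTransition* function names
  /\bvalidateManifest\w*/,   // validateManifest* function names
  /\bevaluateReviewGate\b/,
];

function crossesAuthBoundary(taskText) {
  // No pattern uses the g flag, so test() carries no state between calls.
  return AUTH_BOUNDARY_PATTERNS.some((re) => re.test(taskText));
}

console.log(crossesAuthBoundary("Harden validateManifestPaths against stale run ids")); // true
console.log(crossesAuthBoundary("Fix typo in README")); // false
```

A positive match is only a prompt to open references/rubric-trust-model.md; it does not write the audit factors.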
Fail-closed pattern library
If the task touches relay gates, resolver selectors, recovery paths, audit stamps, or lock/deadline fallthrough behavior, apply references/rubric-fail-closed-patterns.md. Use it to split visible warnings from blocking enforcement and to enumerate sibling states, selectors, call sites, and downstream consumers.
6. Validate the rubric
Quick gate before dispatch:
- Prerequisites hold repo-wide hygiene only; factors stay substantive (tier test)
- Contract/Quality tier minimums met for task size (S/M/L/XL)
- S-size mechanical tasks may use 1 contract factor and no quality factor; do not invent quality factors just to fill a quota
- ≥ 1 automated check across prerequisites + factors
- Every evaluated factor has scoring_guide with low/mid/high anchors
- Criteria are specific and reference discoverable artifacts; targets are concrete
Full checklist, factor counts, grading, and risk signals: references/rubric-validation.md. Grade D = revise; Grade C = warn and state the tradeoff.
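Two of the quick-gate items are mechanical enough to sketch in code: the automated-check minimum and the scoring_guide anchors. quickGate is a hypothetical helper operating on the parsed rubric object. Counting any prerequisite with a command as an automated check is an assumption. Tier minimums and grading are left out because they are defined in references/rubric-validation.md.

```javascript
// Sketch: check two quick-gate items on a parsed rubric object.
// Returns a list of problems; an empty list means these two items pass.
function quickGate(rubric) {
  const problems = [];
  const prerequisites = rubric.prerequisites || [];
  const factors = rubric.factors || [];

  // ">= 1 automated check across prerequisites + factors".
  // Assumption: a prerequisite with a command counts as an automated check.
  const automatedCount =
    prerequisites.filter((p) => Boolean(p.command)).length +
    factors.filter((f) => f.type === "automated").length;
  if (automatedCount < 1) {
    problems.push("no automated check in prerequisites or factors");
  }

  // Every evaluated factor needs low/mid/high anchors in scoring_guide.
  for (const factor of factors) {
    if (factor.type !== "evaluated") continue;
    const guide = factor.scoring_guide || {};
    const missing = ["low", "mid", "high"].filter((anchor) => !guide[anchor]);
    if (missing.length > 0) {
      problems.push(`${factor.name}: scoring_guide missing ${missing.join("/")}`);
    }
  }
  return problems;
}
```

Run against the pagination rubric from step 5, this returns an empty list: npm test supplies the automated check and the evaluated factor has all three anchors.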
7. Simplify the rubric
Before persisting the draft rubric, apply the 6 heuristics in references/rubric-simplification.md.
Apply to all task sizes: rewrite HOW into observable WHAT, merge overlaps, remove unsupported defensive clauses, and verify weights.
8. Persist planner-authored Done Criteria
If operator planning writes the final Done Criteria, persist that decision before dispatch so fresh-context review uses the same anchor. This includes AC-missing inputs, user-provided descriptions, and any case where planning expands, rejects, or narrows issue-body AC:
node ${CLAUDE_SKILL_DIR}/scripts/persist-done-criteria.js --repo . \
--run-id "$RUN_ID" --file /tmp/done-criteria-<N>.md --json
Dispatch with the same RUN_ID and --done-criteria-file ~/.relay/runs/<repo-slug>/$RUN_ID/done-criteria.md. Skip this step only when the issue or intake handoff already provides the final Done Criteria without planner changes.
9. Review the rubric (triggered by ambiguity/risk)
S/M tasks usually skip this step, but ambiguity or risk can opt a rubric of any size into a stress-test. Run the stress-test for L/XL rubrics that have evaluated factors and an ambiguity/risk signal, and for smaller rubrics when the recovered Done Criteria are novel, vague, or easy to game. XL adds calibration simulation only when novel or subjective evaluated factors need it. Skip the stress-test for re-dispatches with iteration history, all-automated rubrics, and simple tasks where recovered Done Criteria map cleanly to checks. Protocol: references/rubric-stress-test.md.
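The trigger conditions can be read as a short decision function. This is one interpretation of the paragraph above, not the protocol itself: shouldStressTest and its flag names are hypothetical, and the flags are judgments the orchestrator supplies.

```javascript
// Sketch: one reading of the stress-test trigger rules.
// All inputs are orchestrator judgments; the field names are illustrative.
function shouldStressTest(task) {
  // Skip re-dispatches that already carry iteration history.
  if (task.isRedispatchWithHistory) return false;
  // Skip all-automated rubrics: there are no evaluated factors to stress.
  if (!task.hasEvaluatedFactors) return false;
  // L/XL: run only when an ambiguity/risk signal exists.
  if (task.size === "L" || task.size === "XL") {
    return Boolean(task.hasAmbiguityOrRisk);
  }
  // S/M: usually skipped, unless Done Criteria are novel, vague, or easy to game.
  return Boolean(task.doneCriteriaNovelVagueOrGameable);
}

console.log(shouldStressTest({ size: "L", hasEvaluatedFactors: true, hasAmbiguityOrRisk: true })); // true
console.log(shouldStressTest({ size: "M", hasEvaluatedFactors: true })); // false
```

The XL calibration simulation is left out; it is a further judgment call on top of a positive result.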
10. Generate dispatch prompt
Take the base template (../relay/references/prompt-template.md) and append Setup, optional task_profile metadata plus Working Guidance when guidance packs are selected, Scoring Rubric, Iteration Protocol, and Score Log sections. Selected pack names and prompt-ready guidance bullets come from references/guidance-packs.md; the Working Guidance section is advisory and must state that it does not override Done Criteria, rubric commands, or scope boundaries. Insert the optional Step 0a block from references/iteration-protocol.md iff any factor has a non-empty tdd_anchor; when no factor has tdd_anchor, keep the emitted prompt identical to the pre-TDD baseline.
Full iteration-protocol text + Score Log format: references/iteration-protocol.md.
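The assembly rules above can be sketched as a function that decides which sections the prompt contains. planPromptSections is a hypothetical helper and several details are assumptions: task_profile metadata is treated as emitted only together with Working Guidance, tdd_anchor is treated as a string, and the Step 0a decision is returned as a flag because its exact position is defined in references/iteration-protocol.md.

```javascript
// Sketch: decide which sections the dispatch prompt contains, in append order.
// Section contents come from the templates; only the selection logic is shown.
function planPromptSections(rubric, guidancePacks = []) {
  const sections = ["Base template", "Setup"];
  // Assumption: task_profile metadata and Working Guidance appear only when
  // at least one guidance pack is selected.
  if (guidancePacks.length > 0) {
    sections.push("task_profile metadata", "Working Guidance");
  }
  sections.push("Scoring Rubric", "Iteration Protocol", "Score Log");

  // Step 0a is included if and only if some factor has a non-empty tdd_anchor.
  const includeStep0a = (rubric.factors || []).some(
    (factor) => typeof factor.tdd_anchor === "string" && factor.tdd_anchor.trim() !== ""
  );
  return { sections, includeStep0a };
}

const plan = planPromptSections({ factors: [{ name: "Pagination robustness" }] });
console.log(plan.includeStep0a); // false
```

With no guidance packs and no tdd_anchor, the section list has no optional parts, which is the condition under which the emitted prompt must match the pre-TDD baseline.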
11. Dispatch
Write the rubric YAML to a temp file alongside the dispatch prompt. Every relay dispatch must pass --rubric-file so the rubric is persisted at anchor.rubric_path for review and merge gates.
node ${CLAUDE_SKILL_DIR}/../relay-dispatch/scripts/dispatch.js . \
-b issue-42 --prompt-file /tmp/dispatch-42.md --rubric-file /tmp/rubric-42.yaml --timeout 3600
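A wrapper that builds the argument list can enforce the --rubric-file rule before anything runs. buildDispatchArgs is a hypothetical helper; it uses only the flags shown in the command above and leaves their meaning, including the unit of --timeout, to dispatch.js.

```javascript
// Sketch: build the dispatch.js argument list and refuse to proceed without a rubric file.
// Only flags that appear in the documented command are used.
function buildDispatchArgs({ repo = ".", bArg, promptFile, rubricFile, timeout }) {
  if (!rubricFile) {
    throw new Error("--rubric-file is required for every relay dispatch");
  }
  const args = [repo, "-b", bArg, "--prompt-file", promptFile, "--rubric-file", rubricFile];
  if (timeout !== undefined) {
    args.push("--timeout", String(timeout));
  }
  return args;
}

const dispatchArgs = buildDispatchArgs({
  bArg: "issue-42",
  promptFile: "/tmp/dispatch-42.md",
  rubricFile: "/tmp/rubric-42.yaml",
  timeout: 3600,
});
console.log(dispatchArgs.join(" "));
```

The resulting array can be passed to Node's child_process.execFile together with the dispatch.js path, which avoids shell quoting issues with file paths.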
When to use
Use this for all tasks dispatched via relay. Rubric depth scales with task size (determined by orchestrator judgment on recovered Done Criteria, file scope, ambiguity, and risk, not raw issue AC count):
- S (simple fix, typo, 1-liner): 1 contract factor; add a quality factor only when the task has real design judgment; skip stress-test
- M (standard feature): 3-5 factors, skip stress-test
- L (cross-cutting, multi-file): 4-6 factors; stress-test only when evaluated factors plus ambiguity/risk signal exist
- XL (architecture change): 5-8 factors; stress-test only when evaluated factors plus ambiguity/risk signal exist; add calibration only when useful
Re-dispatches automatically prepend previous Score Log + reviewer feedback to the prompt (see relay-dispatch docs). Full rubric guide: references/rubric-design-guide.md.
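The size guidance can be kept as a small lookup table for sanity-checking a draft rubric. RUBRIC_DEPTH and factorCountInRange are hypothetical. The S range of 1 to 2 is an interpretation of "1 contract factor plus an optional quality factor", and task size itself remains an orchestrator judgment.

```javascript
// Sketch: factor-count ranges per task size, taken from the list above.
// S is read as 1 contract factor plus an optional quality factor (an interpretation).
const RUBRIC_DEPTH = {
  S: { min: 1, max: 2 },
  M: { min: 3, max: 5 },
  L: { min: 4, max: 6 },
  XL: { min: 5, max: 8 },
};

function factorCountInRange(size, factorCount) {
  const range = RUBRIC_DEPTH[size];
  if (!range) throw new Error(`unknown task size: ${size}`);
  return factorCount >= range.min && factorCount <= range.max;
}

console.log(factorCountInRange("M", 4)); // true
console.log(factorCountInRange("S", 4)); // false
```

A count outside the range is a cue to re-check the size call or the factor list, not an automatic failure.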