fix-verify-loop
Fix-Verify Loop
fix-verify-loop is a bounded resolver. Input: confirmed P0/P1 findings. Output: staged code where each listed finding is resolved, OR an escalation list of findings it could not resolve in 2 attempts.
It verifies ONLY per-finding resolution. Regression detection, finding new issues, and reviewing the diff as a whole are explicitly NOT its job — those belong to whoever reviews the diff next (caller's choice).
When to use
After a two-pass-review (or any review) produces confirmed P0/P1 findings that need fixing. Also after test failures that need code or test fixes.
Protocol
Input
- Findings: confirmed P0/P1 findings conforming to the Output Schema below
- Artifact paths: files to fix
- Criteria: the original criteria the fix must satisfy
Intake filter: Only process findings where verdict = "confirmed" and severity in ["P0", "P1"]. Ignore P2/P3 findings — they are out of scope. Note: findings with verdict: "confirmed" are accepted regardless of source — the caller is responsible for confirmation quality (e.g., plan-runner sets this as an orchestrator escape hatch for per-wave review findings that have no verifier pass).
Pre-gate (findings without an independent verifier pass): A finding requires a verifier pre-gate BEFORE Round 1 if it has not been verifier-validated. Detect via:
evidencefield is null or missing, ORevidencefield starts with "Orchestrator-confirmed".
Confidence is the reviewer or verifier's honest signal — it does not by itself trigger pre-gate. Pre-gate's job is to catch findings that haven't been independently verified, not findings that the verifier already vouched for at low confidence.
For these findings, run this pre-gate first:
- Spawn the
verifieragent with: Artifact = the finding's file (or surrounding context), Findings = the single finding, Criteria = "is this finding real?". - If verifier returns
rejected→ adds todroppedbucket → drop the finding from fix-verify-loop. Skip to next finding. - If verifier returns
confirmedordemoted(still real) → proceed to Round 1 normally. - If verifier output is inconclusive (no parseable verdict) → treat as a failed attempt. Proceed to Round 1 anyway (this consumes one of the two attempts; see inconclusive rule below).
Findings that pass the intake filter and either don't trigger the pre-gate or pass it proceed to Round 1.
Per-finding loop
Process findings sequentially, one at a time. Do not batch. For each finding, run Round 1; if not resolved, run Round 2; if still not resolved, escalate. Round 1 and Round 2 mirror each other in shape — same fix-then-verify dispatch, same verifier question.
Preconditions — pre-staged hunk check
This check fires in Round 1 only, AFTER the fix subagent declares files_changed and BEFORE staging. The fix subagent doesn't know what files it will touch until it runs, so the Round 1 order is: spawn fix → check pre-staged hunks → stage → verify. In Round 2, any pre-existing staged content is from Round 1's prior attempt and not user-authored — the check would fire spuriously, so we skip it. Round 2 order is: spawn fix → stage → verify.
Before staging (Round 1 only):
- Run
git diff --staged -- <files the fix will touch>(thefiles_changedreturned by the fix subagent). - If non-empty (pre-existing staged hunks exist in those files):
- Inspect the hunks. Form a heuristic judgment:
- No overlap with the fix's likely lines, hunks small, look unrelated → recommend "Commit pre-existing first"
- Overlap with the fix's lines, OR hunks large/sprawling → recommend "Stash pre-existing"
- Hunks clearly continue the fix's logical change → recommend "Proceed (treat as part of this fix)"
- Use the
AskUserQuestiontool with options "Commit pre-existing first", "Stash pre-existing", "Proceed (treat as part of this fix)". Include a one-line summary of what was found (e.g., "3 hunks in auth.js totaling 18 lines, no overlap with fix's edits"). Surface the heuristic recommendation as the first option labeled "(Recommended)".
- Inspect the hunks. Form a heuristic judgment:
Round 1 — Fix + Verify
- Fix. Spawn a fix subagent (Sonnet for scoped fixes, Opus for cross-file fixes). Subagent receives: the finding (full schema), the affected file path(s), the criterion it violates. Subagent edits the working tree and returns
{ files_changed: [paths], summary: string, concerns: [string] | null }. - Pre-staging check. Run the Preconditions check on
files_changed. - Stage.
git add <files_changed>— do NOT commit. The user decides when to commit. - Verify. Spawn the
verifieragent with:- Artifact: the staged diff scoped to this finding's files (
git diff --staged -- <files_changed>) - Findings: the original finding being fixed (single)
- Criteria: ONLY "is this finding resolved?"
- Output contract: "Return a ReviewOutput envelope (see Output Schema) with a verdict on the single finding."
- Artifact: the staged diff scoped to this finding's files (
- Decide. Map the verifier's verdict on the finding:
confirmed→ still real and still in scope (P0/P1) → not resolved → proceed to Round 2.rejected→ not a real issue → adds toresolvedbucket → done.demoted→ check the new severity:- new severity in [P0, P1] → still in fix-verify-loop scope → proceed to Round 2.
- new severity in [P2, P3] → out of fix-verify-loop scope → adds to
demotedbucket → done (the issue exists at lower severity; hand-off implicit).
- Inconclusive (the verifier subagent fails to return a parseable ReviewOutput — crash, malformed output, no verdict on the finding) → counts as a failed attempt. Proceed to Round 2. No retry loop.
Round 2 — Fix + Verify
Same shape as Round 1. The only difference is the fix subagent gets Round 1 context.
- Fix. Spawn another fix subagent with full Round 1 context (what was attempted, why it didn't work, the verifier's evidence). Subagent returns
{ files_changed: [paths], summary: string, concerns: [string] | null }. - Stage.
git add <files_changed>. (No pre-staging check in Round 2 — see Preconditions for why.) - Verify. Same dispatch as Round 1 — verifier asked "is this finding resolved?" on the single finding.
- Decide. Map the verifier's verdict on the finding:
confirmed→ still real and still in scope (P0/P1) → not resolved → adds toescalatedbucket → escalate (do not attempt Round 3).rejected→ not a real issue → adds toresolvedbucket → done.demoted→ check the new severity:- new severity in [P0, P1] → still in fix-verify-loop scope → adds to
escalatedbucket → escalate. - new severity in [P2, P3] → out of fix-verify-loop scope → adds to
demotedbucket → done (the issue exists at lower severity; hand-off implicit).
- new severity in [P0, P1] → still in fix-verify-loop scope → adds to
- Inconclusive (the verifier subagent fails to return a parseable ReviewOutput) → counts as a failed attempt → adds to
escalatedbucket → escalate. No retry loop.
Escalation
If Round 2 still has the finding unresolved:
- STOP. Do not attempt Round 3.
- Present to the user:
- The original finding
- What was attempted in Rounds 1 and 2
- What's still unresolved and why (verifier's evidence)
- A one-line summary of what's currently staged for this finding (derive via
git diff --staged --stat -- <files_changed>, e.g., "Currently staged: R2's changes to auth.js, +12/-4 lines") - Use the
AskUserQuestiontool with options: "Manual fix", "Try a different approach", "Defer this finding", "Discard R2 changes and revert". Recommended: "Defer this finding"
After all findings are processed, return a FixVerifyLoopOutput envelope with four buckets:
- resolved: finding IDs marked done in Round 1 or Round 2 (with the staged files)
- escalated: findings that hit Round 2 without resolution, with attempt summaries
- dropped: findings the pre-gate verifier rejected as not-real
- demoted: findings demoted to P2/P3 by the verifier (out of fix-verify-loop scope)
Bucket assignment by verdict path:
- Pre-gate
rejected→dropped - Pre-gate
confirmedordemoted→ proceed to Round 1 - R1
rejected→resolved - R1
confirmed(still real, P0/P1) → proceed to Round 2 - R1
demotedto [P0, P1] → proceed to Round 2 - R1
demotedto [P2, P3] →demoted - R2
rejected→resolved - R2
confirmed(still real, P0/P1) →escalated - R2
demotedto [P0, P1] →escalated - R2
demotedto [P2, P3] →demoted
Rules
- Max 2 attempts. Never loop beyond Round 2.
- Scoped fixes. Fix subagents must NOT edit files outside the scope of their finding without user approval. If a fix requires additional files, the subagent must FIRST return
{ needs_scope_expansion: true, additional_files: [paths], justification: string }instead of making edits. The parent then uses theAskUserQuestiontool with options: "Approve expanded scope", "Reject — fix within original scope only", "Defer this finding". Recommended: "Approve expanded scope" (include the justification and file list). On approval, re-dispatch the subagent with expanded scope. Additional files are included in verification. - Always verify. Every fix gets the verifier-on-one-question check. Don't skip — the verifier agent is the only verification mechanism this skill uses.
- Test failures. Determine if it's a code bug or test bug first, then fix the right one. This judgment happens during the fix subagent's work — the subagent inspects the failure and the test, decides which is wrong, and fixes the right one. The skill itself does not branch on this; it's part of the fix subagent's task.
Output Schema
FixVerifyLoopOutput
The skill returns this envelope after all findings are processed:
{
resolved: [Finding.id, ...], // fixed in R1 or R2
escalated: [{ // could not be fixed in 2 attempts
id: Finding.id,
attempts: [string, string], // R1 + R2 summaries
evidence: string | null, // verifier's evidence (null if R2 was inconclusive)
staged_summary: string // e.g., "R2's changes to auth.js, +12/-4 lines"
}, ...],
dropped: [{ // pre-gate verifier rejected as not-real
id: Finding.id,
reason: string // verifier's rejection evidence
}, ...],
demoted: [{ // demoted to P2/P3 (out of scope)
id: Finding.id,
new_severity: "P2" | "P3",
evidence: string // verifier's demotion reasoning
}, ...]
}
Finding
Finding {
id: sequential number starting from 1,
severity: "P0" | "P1" | "P2" | "P3",
title: short title,
body: detailed explanation with evidence,
file: file path or null for global issues,
line_start: number or null,
line_end: number or null,
confidence: 0.0-1.0,
criterion: what was violated,
verdict: "confirmed" | "demoted" | "rejected" | null,
evidence: reasoning for verdict | null
}
ReviewOutput
Findings are wrapped in a ReviewOutput envelope:
ReviewOutput {
schema_version: "v1",
findings: Finding[],
checks_run: string[]
}
Severity calibration
- P0 — Must fix: breaks functionality, security breach, data loss, or violates criteria
- P1 — Fix before shipping: correct but incomplete, fragile, or reliability risk
- P2 — Should fix: quality issue, code smell, not blocking
- P3 — Nice to have: observation, style, minor improvement
Field notes
confidence— 1.0 means certain, below 0.5 means you're guessing. Be honest.criterion— required for P0/P1 findings. Name the specific criterion violated.verdict— populated by the verifier in two-pass review. Set tonullwhen producing findings directly.evidence— verifier's reasoning for the verdict. Set tonullwhen producing findings directly.checks_run— list every criterion evaluated, file path checked, or acceptance criterion verified. For ACs, useAC-N: PASS — [evidence]orAC-N: FAIL — [reason].
More from preetamnath/agent-skills
shopify-dev-mcp
Routes Shopify Dev MCP calls for surfaces NOT covered by the bundled Shopify skills: `storefront-graphql`, `customer`, `partner`, `payments-apps`, `functions`, `hydrogen`, `liquid`, `custom-data`. SKIP for Admin GraphQL or App Home markup — the bundled `shopify-admin` and `shopify-polaris-app-home` skills cover those. SKIP entirely for `@shopify/post-purchase-ui-extensions-react` — the MCP doesn't index that legacy SDK; use `post-purchase-ui-extension` instead.
15plan-builder
Creates dependency-ordered, wave-grouped executable plans from a goal + context. Produces markdown plans with parallel execution waves compatible with plan-runner. Use when you need to break a goal into sequenced, atomic work items before execution.
10polaris-web-components
Polaris <s-*> web component catalog, props/slots reference, and layout patterns for the polaris-app-home surface. Use when writing/editing markup with <s-*> tags (<s-page>, <s-section>, <s-stack>, <s-button>, <s-modal>, etc.), building UI in the Admin App Home SPA, looking up a Polaris component's props/slots/tone/size, or calling App Bridge APIs (shopify.toast, shopify.resourcePicker).
7propose-alternatives
Propose 2-3 genuinely different approaches to a problem with concrete trade-offs. Use when evaluating design choices, exploring options, or challenging the current approach.
6codex
Get an independent second opinion from Codex (OpenAI) via MCP. Modes: review, alternatives, sanity-check. Use when user asks for Codex input, a second perspective, or invokes /codex.
2polaris-app-home-web-components
Polaris `<s-*>` web component catalog for the Admin App Home surface. Use when writing/editing Polaris markup — `<s-page>`, `<s-section>`, `<s-button>`, `<s-modal>`, tables, forms, layout. TRIGGER when: code contains `<s-*>` tags; user asks to build/update Admin App Home UI.
2