vibe-planning-guard
Vibe Planning Guard
Overview
Create plans that survive contact with reality. Treat the user as having only rough knowledge of the target behavior unless the workspace proves otherwise, and convert that rough request into a verified, option-aware plan instead of a confident guess.
When a real design branch exists, compare 2-3 viable options, recommend the safest path for the current phase, and keep verified facts separate from expert judgment.
This skill is for planning, not coding. Do not edit files or propose implementation as ready to start until the verification gate passes.
Rule Precedence
Apply rules in this order when instructions compete:
1. Non-Negotiable Rules
2. Workflow
3. Report Structure
4. Communication Style
Interpret stop consistently: stop implementation for the current phase, keep planning alive, and output only proof-gathering, test-lock, or handoff material until the blocker is resolved.
Scale the Process
Use the lightest safe mode for the task size and risk.
- `light`: Localized or low-blast-radius work such as a small refactor, a focused bug fix, a single integration touchpoint, or a narrow UI change in a known stack. Use a compact report, ask only 0-2 plan-changing questions after workspace scan, and compare options only when a real design branch remains. Still label every implementation-relevant assumption with an evidence class.
- `strict`: Multi-file or high-blast-radius work such as migrations, rewrites, replacement or restoration of behavior, external integrations, auth, billing, data model changes, performance work, security-sensitive flows, or anything with unclear feasibility. Use the full report, ask only questions that resolve a blocker, and use structured comparison when ambiguity affects behavior, tests, data, or external contracts.
- Choose `strict` immediately when auth, billing, security, migrations, external contracts, restoration work, or any `high` or `critical` `Unproven` item is involved.
- Choose `strict` immediately when a behavioral equivalence analysis contains any `Unknown` dimension or when a `must preserve` dimension has been reclassified to `in scope for change` during analysis.
- Choose `strict` immediately when the change touches shared static state, singletons, global event buses, or hook/subscriber ordering.
- Choose `strict` immediately when the change touches persisted user config, default values for absent fields, opt-out paths, or schema migrations.
- Choose `strict` immediately when the change touches the build, package, or release path, including manifest version bumps and entry-point command sources.
- Choose `strict` immediately when the change crosses a client/server or otherwise untrusted-payload trust boundary.
- Choose `strict` immediately when the change touches external contracts, execution sequencing, nonce or replay protection, FIFO ordering, or snapshot timing relative to writes.
- Stay in `light` only when the slice is localized, the main behavior contract is already mostly clear, and any remaining uncertainty is deferred or non-blocking for the current phase.
- If the choice feels borderline, choose `strict` and keep the report concise rather than weakening the safety bar.
Do not use light mode as an excuse to skip verification. It only reduces report weight and encourages tighter scoping. The verification bar does not change across modes.
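The escalation bullets above can be read as a single predicate. The following is a minimal sketch in Python, assuming a hypothetical `SliceSignals` record and hypothetical trigger labels; none of these names come from the skill itself, and the check simplifies the "deferred or non-blocking uncertainty" condition into two booleans.

```python
from dataclasses import dataclass

# Hypothetical trigger labels; the names are illustrative, not defined by the skill.
STRICT_TRIGGERS = frozenset({
    "auth", "billing", "security", "migration", "external_contract",
    "restoration", "shared_static_state", "persisted_user_config",
    "build_or_release_path", "trust_boundary", "execution_sequencing",
})


@dataclass(frozen=True)
class SliceSignals:
    touched_surfaces: frozenset               # labels describing what the change touches
    unproven_risks: tuple = ()                # risk level of each Unproven item
    has_unknown_dimension: bool = False       # an equivalence dimension is Unknown
    must_preserve_reclassified: bool = False  # a must-preserve dimension moved in scope
    localized: bool = True
    contract_mostly_clear: bool = True
    borderline: bool = False


def choose_mode(s: SliceSignals) -> str:
    """Return "strict" or "light" following the escalation bullets."""
    if s.touched_surfaces & STRICT_TRIGGERS:
        return "strict"
    if any(r in ("high", "critical") for r in s.unproven_risks):
        return "strict"
    if s.has_unknown_dimension or s.must_preserve_reclassified:
        return "strict"
    # Borderline cases resolve to strict rather than weakening the safety bar.
    if s.borderline or not (s.localized and s.contract_mostly_clear):
        return "strict"
    return "light"
```

The function only picks report weight. The verification bar is the same whichever value it returns.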
Non-Negotiable Rules
- Assume missing detail, hidden constraints, and false certainty are the default failure modes.
- Include only actions whose feasibility is supported by `Primary source` or `Local reproduction`.
- Label every fact, assumption, and dependency with exactly one evidence class: `Primary source`, `Local reproduction`, or `Unproven`.
- State something as fact only when it is backed by `Primary source` or `Local reproduction`.
- When recommending an option, keep verified facts, `Unproven` blockers, and expert judgment visibly separate.
- When a request replaces, restores, rolls back, or rewrites existing behavior, inspect a previously-good commit before planning the change. If no known-good commit can be verified, mark that gap as `Unproven` and stop.
- When a change touches existing behavior, whether through replacement, refactoring, migration, internal implementation change, or explicit specification change, perform the behavioral equivalence analysis from `references/behavioral-equivalence-analysis.md`. Separate each comparison dimension into `in scope for change` or `must preserve`, classify every dimension explicitly, and include the analysis in the plan output. If any `must preserve` dimension is found to be non-equivalent, stop and report to the user before continuing. If any dimension is classified `Unknown`, the change is blocked until proof is provided.
- Define tests before implementation steps. For discovery-only phases, define proof checks and exit criteria before any spike or experiment.
- When the change touches existing behavior, build the behavior contract inventory from `references/behavior-contract-inventory.md` before classifying any dimension in the behavioral equivalence analysis. The inventory must label each of its three buckets with `Primary source`, `Local reproduction`, or `Unproven`.
- Apply `references/plan-boundary-controls.md` (plan-body firewall) before emitting the final plan output. Every addition is classified, `impl-detail` is deferred unless it proves feasibility or defines a current-slice test, and `history-only` material is collapsed or dropped.
- Once the current-slice success criteria are recorded, freeze them. Admit a later addition only when it cites a user requirement, a newly verified source, or a `must preserve` dimension that turned out to be non-equivalent. Suggestions outside those bases go into deferred decisions or `Unproven` items, not into the success criteria.
- When the user asks for a plan in response to diagnostic findings, review comments, audit output, or analyzer warnings, default to the smallest corrective slice that directly addresses those findings. A diagnostic finding is evidence of a gap, not permission to add adjacent hardening, new modes, new detectors, or extra policy surfaces unless the finding itself or a verified source requires them.
- Apply the completion gate from `references/plan-boundary-controls.md`. Once the gate passes for the current slice, stop adding plan detail. The completion gate does not override the `Unproven` stop rule: any `Unproven` item with `Phase relevance: implementation` for the current slice keeps the plan blocked, regardless of risk level. Risk level alone never exempts an item from the stop rule; only `Phase relevance` outside the current slice's implementation step does.
- If even one implementation-relevant item remains `Unproven` for the current phase, stop after the plan. Do not authorize implementation. Shrink the current slice if needed until the current phase has zero `Unproven` implementation assumptions.
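The success-criteria freeze rule amounts to a routing decision for each late suggestion. Here is a minimal sketch, assuming hypothetical string labels for the three admissible bases and a hypothetical `ProposedCriterion` record; the skill names the bases in prose but does not define identifiers for them.

```python
from dataclasses import dataclass
from typing import Optional

# The three bases named in the freeze rule; the string labels are illustrative.
ADMISSIBLE_BASES = frozenset({
    "user_requirement",
    "newly_verified_source",
    "must_preserve_non_equivalent",
})


@dataclass(frozen=True)
class ProposedCriterion:
    text: str
    basis: Optional[str] = None  # what the proposal cites, if anything


def route_addition(p: ProposedCriterion) -> str:
    """Decide where a suggestion made after the freeze should be recorded."""
    if p.basis in ADMISSIBLE_BASES:
        return "success_criteria"
    # Anything else stays out of the frozen criteria.
    return "deferred_or_unproven"
```

A suggestion that cites only a diagnostic finding, for example, would be routed to deferred decisions or `Unproven` items under this sketch.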
Workflow
1. Ground the request in the workspace first.
   - Read the relevant code, configs, tests, schemas, docs, commit history, and errors before asking the user.
   - Prefer local inspection and reproducible commands over memory.
   - Use primary sources for unstable facts such as framework behavior, product limits, versions, APIs, or policies.
   - If the request spans independent subsystems, call that out and propose a decomposition before planning both at once.
   - Prefer established local patterns over inventing a fresh structure for the current slice.
2. Choose the smallest safe planning slice.
   - Separate today's actionable slice from later phases.
   - Move non-blocking or cosmetic uncertainty into deferred decisions when it does not affect the current code path, test design, or behavior contract.
   - Move design branches that do not affect the current phase into deferred decisions instead of bloating the current plan.
   - If the task is large, split it into discovery, test-lock, and implementation phases instead of forcing one giant plan.
   - Slice selection comes before inventory and freeze (steps 3 and 4) so those operate on a defined current slice. If discovery later reveals the slice is wrong, restart the workflow at this step rather than retrofitting inventory or freeze onto a different slice.
3. Build the behavior contract inventory for the current slice when the change touches existing behavior.
   - Read `references/behavior-contract-inventory.md` and populate the three buckets (immediate observable behavior, internal state transition, and persistent / lifecycle behavior) before classifying any dimension in the behavioral equivalence analysis.
   - Replacement / restoration branch: when the request replaces, restores, rolls back, or rewrites existing behavior, run `references/change-recovery-checklist.md` steps 1-5 (identify target behavior, find the last known-good implementation, prove it was good, inspect it, build the two-column historical / current inventory) as a prerequisite to this step. The two-column inventory from the recovery checklist replaces this step's single-column inventory. The success-criteria freeze in step 4 then locks against the historical contract too, not just current behavior. Step 8's recovery flow becomes the residual / verification re-entry after the criteria are frozen.
   - Label every bucket entry with exactly one of `Primary source`, `Local reproduction`, or `Unproven`. `Unproven` buckets are recorded with risk, phase relevance, and next review point, not silently merged with proven entries.
   - The inventory rows are the inputs to the equivalence dimensions. A dimension cannot be classified `Equivalent` against an `Unproven` inventory entry.
   - Omit the inventory only when the current slice provably does not touch existing behavior. Refactors, migrations, and internal implementation changes count as touching existing behavior under the Non-Negotiable Rule and require the inventory even when most rows collapse to `Not applicable` with a documented evidence-backed rationale. Do not emit empty bucket headings to satisfy structure when omission is justified.
4. State and freeze current-slice success criteria.
   - Write the success criteria for the current slice (the slice chosen in step 2). Once recorded, lock them. Apply the Success Criteria Freeze rule from `references/plan-boundary-controls.md` before admitting any later expansion.
   - A later addition is admissible only when it cites a user requirement, a newly verified source, or a `must preserve` dimension that turned out to be non-equivalent. Other suggestions are recorded as deferred decisions or `Unproven` items with risk, phase relevance, and next review point.
   - When a `must preserve` reclassification triggers a success-criteria expansion, escalate to `strict` mode and record the reclassification basis next to the new criterion.
   - Future-phase criteria belong in deferred decisions, not in the current-slice freeze. If discovery later forces a slice change, restart at step 2; do not retrofit the freeze onto a different slice.
5. Build a fact table.
   - Turn requirements, assumptions, constraints, and inferred dependencies into a table with evidence class, source, and impact.
   - If the right evidence class is unclear, read `references/evidence-rubric.md`.
6. Surface ambiguity and design branches explicitly.
   - First list every ambiguous or conflicting requirement and offer alternatives that clarify behavior.
   - After the requirements are clarified, compare 2-3 viable design options only when a real design branch still remains.
   - For each option, include tradeoffs, blocking unknowns, and YAGNI impact for the current phase.
   - Recommend an option only when the decision inputs are sufficiently proven.
   - Recommendation may include expert judgment, but present `Why (verified facts)` separately from `Why (judgment)`.
   - If key decision inputs are still missing, do not force a recommendation. Give a `Recommended proof path` instead.
7. Triage `Unproven` items.
   - For each `Unproven` item, record risk level: `critical`, `high`, `medium`, or `low`.
   - Record whether it affects feasibility, behavior, data, integration, performance, security, UX, or cosmetic detail.
   - Record whether it blocks discovery, test design, or implementation for the current phase.
   - Use the risk level to order work and choose `light` versus `strict`, but do not treat risk as permission to implement around an `Unproven` blocker.
   - If a `light` plan contains any `high` or `critical` `Unproven` item, upgrade the report to `strict`.
8. Run the residual recovery checks for replacement or restoration.
   - For replacement / restoration / rollback / rewrite work, the early branch in step 3 already executed `references/change-recovery-checklist.md` steps 1-5 (target behavior, known-good commit, inspection, two-column inventory) before the freeze. This step covers the residual recovery checklist items (steps 6-9): historical/current side-by-side comparison, applicable failure-pattern checks for the recovery surface, plan from verified evidence only, and the recovery-flow's behavioral equivalence anchor.
   - For non-restoration work, this step is a no-op; skip it.
   - If the early branch did not run because the work was reclassified mid-workflow as replacement/restoration, restart at step 2 with the restoration branch active. Do not retrofit recovery onto a frozen current-only contract.
9. Run failure-pattern checks for applicable high-risk surfaces.
   - Walk the slice's surface through `references/failure-pattern-checklist.md` and pick only the sections whose preconditions match. Fold answers into facts, blockers, or tests; do not paste empty checklist headings into the plan.
   - This step runs before test-lock so checklist answers can shape the test plan rather than retroactively mutating a locked plan. If new test obligations or blockers surface in step 10 or later that should have been caught here, restart at this step rather than amending tests in place.
   - Record the checklist applicability per `references/failure-pattern-checklist.md` §Applicability Record so the completion gate has a verifiable artifact.
10. Lock proof checks and tests.
    - In discovery phases, define proof checks, spike exit criteria, and the evidence needed to clear each blocker.
    - In implementation phases, define acceptance tests, regression tests, and important failure cases before implementation steps.
    - Align the proof checks and tests with the chosen option or proof path.
    - When a behavioral equivalence analysis has been performed, include test scenarios for every dimension classified as `Equivalent` or `Changed (in scope)`. A test plan that only verifies the immediate observable result is incomplete. Persistence round-trips, lifecycle events, external contract guarantees, and resource cleanup must be covered when those dimensions are relevant.
    - If any dimension from the behavioral equivalence analysis is classified `Unknown`, add a proof check specifically for that dimension before implementation.
    - Incorporate the failure-pattern checklist answers from step 9 into the test plan (e.g., a step-9 `A.1 lifecycle` blocker implies a late-subscriber regression test).
    - If the behavior is not yet clear enough to test, stay in discovery. Do not smuggle implementation into the discovery phase.
11. Apply plan-boundary controls and the completion gate.
    - Walk every plan addition through `references/plan-boundary-controls.md` (classification, `impl-detail` exception, `history-only` collapse, success-criteria freeze, plan-body firewall).
    - Apply the completion gate from `references/plan-boundary-controls.md`. Once the gate passes for the current slice, stop iterating; do not extend the plan with additional `impl-detail` or speculative options. The completion gate respects the `Unproven` stop rule and never overrides it.
    - The completion gate's "applicable failure-pattern checklist sections cleared" criterion reads the applicability record from step 9; missing or unjustified non-selections fail the gate.
12. Apply the verification gate.
    - Use `references/design-exploration-rules.md` as a final self-check for comparison discipline, scope control, recommendation framing, and blocker visibility.
    - If any current-phase implementation-relevant item is still `Unproven`, stop at the plan and say implementation must not begin. Deferred / non-current-phase `Unproven` items remain documented with `Phase relevance` and `Next review point`; they do not block the current phase under this gate.
    - If all current-phase implementation-relevant items are proven, deliver the plan and note that implementation may begin only in a later execution step.
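The verification gate at the end of the workflow can be sketched as a filter over plan items. The `PlanItem` record, its field names, and the phase-relevance labels other than `implementation` are assumptions made for illustration. The skill specifies the rule, not a data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanItem:
    name: str
    evidence: str         # "Primary source", "Local reproduction", or "Unproven"
    phase_relevance: str  # e.g. "implementation", "discovery", "later-phase"
    risk: str = "low"     # recorded for ordering only; the gate ignores it


def blockers(items):
    """Return the items that keep the current phase blocked."""
    return [
        i for i in items
        if i.evidence == "Unproven" and i.phase_relevance == "implementation"
    ]


def verification_gate(items) -> str:
    """Report whether the plan may be handed to a later execution step."""
    blocked = blockers(items)
    if blocked:
        names = ", ".join(i.name for i in blocked)
        return f"BLOCKED: implementation must not begin ({names})"
    return "PASSED: implementation may begin in a later execution step"
```

In this sketch a low-risk `Unproven` item with implementation relevance still blocks, while a high-risk `Unproven` item deferred to a later phase does not. Risk level orders the work but never opens the gate.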
Evidence Classes
- `Primary source`: Official documentation, authoritative specifications, upstream source code, vendor docs, user-provided source material, or a known-good historical implementation. Use this for product claims, framework rules, and intended behavior.
- `Local reproduction`: Behavior confirmed in the current environment by reading local code, running non-mutating checks, reproducing a failure, or observing a passing test locally.
- `Unproven`: Memory, inference, secondhand summaries, forum advice, stale docs, unexecuted hypotheses, or missing historical evidence.
Never upgrade Unproven to fact. Convert it into a question, a discovery step, or a blocker.
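One way to keep the three classes from blurring is to model them as a closed enumeration and route phrasing through a single check. This is a minimal sketch. The `Evidence` enum and the helper names are hypothetical, and the output strings are only an illustrative format.

```python
from enum import Enum


class Evidence(Enum):
    PRIMARY_SOURCE = "Primary source"
    LOCAL_REPRODUCTION = "Local reproduction"
    UNPROVEN = "Unproven"


def may_state_as_fact(evidence: Evidence) -> bool:
    """Only the two proven classes can back a factual statement."""
    return evidence in (Evidence.PRIMARY_SOURCE, Evidence.LOCAL_REPRODUCTION)


def phrase(claim: str, evidence: Evidence, source: str) -> str:
    """Render a claim as a fact-table line or as an open item."""
    if may_state_as_fact(evidence):
        return f"{claim} | {evidence.value} | {source}"
    # Unproven claims are never rendered as facts.
    return f"UNPROVEN: {claim} (needs a question, discovery step, or blocker)"
```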
When Primary Sources Are Unavailable
- Say exactly why the primary source is unavailable: network restriction, NDA boundary, missing access, removed docs, or absent vendor material.
- Fall back to checked-in specifications, vendored source, user-provided documents, release artifacts, and local reproduction where possible.
- If none of those can prove the claim, keep it as `Unproven`.
- Do not substitute memory or generic internet recollection for a missing primary source.
Expert Input
Treat expert intuition as useful input for option generation, risk ranking, and spike design, but not as established fact by itself.
- Record expert judgment as an `Unproven` lead until a primary source or local reproduction backs it.
- Use expert input to decide the fastest proof path.
- Use expert input to shape recommendations only after separating verified facts from judgment.
- Do not use expert input alone to justify implementation steps.
User-Acknowledged Risk
If the user explicitly says they want to proceed despite an Unproven blocker, record that as Acknowledged risk for handoff, not as proof.
- Keep the evidence class as `Unproven`.
- Record the user rationale, impact area, owner or decision-maker, and revisit trigger.
- Keep the stop condition unchanged inside this skill. `Acknowledged risk` documents a conscious exception request; it does not authorize implementation here.
- Use this when the user is intentionally accepting delivery, schedule, or business risk and needs a precise record of what remains unproven.
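The separation between acknowledging a risk and clearing a blocker can be made explicit in a record type. The sketch below assumes a hypothetical `AcknowledgedRisk` dataclass whose fields mirror the handoff items listed above; the function name and signature are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcknowledgedRisk:
    item: str
    user_rationale: str
    impact_area: str
    owner: str
    revisit_trigger: str
    evidence: str = "Unproven"  # acknowledgment never changes the evidence class


def implementation_blocked(unproven_blockers, acknowledged) -> bool:
    """The stop condition depends only on the remaining blockers.

    `acknowledged` is accepted to make the point explicit: it is recorded
    for handoff but deliberately ignored when deciding whether to stop.
    """
    return len(unproven_blockers) > 0
```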
Planning Rules Under Uncertainty
- Do not include speculative implementation steps whose success depends on an `Unproven` claim.
- Replace speculative implementation with the minimum proof-gathering task needed to turn that claim into `Primary source` or `Local reproduction`.
- If feasibility itself is unproven, stop at the investigation plan. Do not draft production code steps behind it.
- If two verified sources disagree, report the conflict instead of choosing silently.
- If a decision is genuinely deferrable and does not affect the current phase, mark it as a deferred choice instead of letting it bloat the blocker list.
- Prefer the simplest proven path that serves the current phase over a richer but speculative design.
- Split work so each current-phase unit has one clear purpose, boundary, and test surface.
- When recommending an option, make it obvious which part is verified input and which part is engineering judgment.
Iterative Planning
Use phased planning when a full end-to-end plan would be blocked or too heavy.
At the start of every phase, revisit deferred choices from the previous phase. Either prove them, keep them deferred with a named next-review point, or promote them to blockers.
- Discovery phase
  - Prove feasibility, gather missing sources, inspect prior behavior, and reduce uncertainty.
  - Output only proof tasks, proof checks, and decision gates.
  - Exit discovery only when the current slice has a stated behavior contract, a plausible approach backed by `Primary source` or `Local reproduction`, and no remaining unknowns that block writing tests for this slice.
- Test-lock phase
  - Once the current slice is understood, define the tests and acceptance criteria needed before coding.
  - Exit test-lock only when the tests, proof checks, and acceptance criteria are specific enough that another engineer could implement against them without inventing the missing behavior.
- Implementation phase
  - Only after the current slice has no `Unproven` implementation blockers.
If discovery stops reducing uncertainty, shrink the slice, gather a missing source, or ask the minimum blocking question. Do not loop indefinitely on vague research.
MVP thinking is allowed only by shrinking scope until the MVP slice itself is verified. Do not label a speculative slice as safe just because it is small.
Report Structure
Choose the lightest format that still makes blockers obvious.
Use this compact structure for light mode. Omit Recommended approach or proof path if no real design branch exists:
# [Short plan title]
## Goal
## Verified facts
- Item — Evidence — Source
## Recommended approach or proof path
- Option or proof path:
- Why (verified facts):
- Why (judgment):
## Open questions, contradictions, deferred choices, or future-phase unproven items
- Item — Status: `contradiction` | `deferred` | `unproven` | `acknowledged-risk`
- Risk:
- Next proof or review point:
## Proof checks and test plan
- Proof checks or tests required before coding
## Next steps
1. [Discovery or implementation-prep steps only]
## Stop condition
- State explicitly whether implementation is blocked or allowed.
Use this full structure for strict mode:
# [Short plan title]
## Goal and success criteria
## Verified facts
| Item | Evidence | Source | Why it matters |
| --- | --- | --- | --- |
## Ambiguities and contradictions
- Issue:
- Options:
- Recommended option or proof path:
- Why (verified facts):
- Why (judgment):
- Blocking proof:
## Unproven items and required proof
- Item:
- Risk:
- Phase relevance:
- Missing proof:
- Fastest safe way to prove it:
- Next review point:
## Test plan
- Acceptance tests
- Regression tests
- Negative or edge cases
## Implementation plan
1. [Only proven steps]
2. [Only proven steps]
## Stop condition
- State explicitly whether implementation is blocked or allowed.
Include the following two sections in both modes when the change touches existing behavior. The behavior contract inventory comes first, the behavioral equivalence analysis comes second, and the equivalence table must visibly cite or align with inventory rows. Once these sections have been emitted, they must remain in the output regardless of subsequent mode changes, scope reclassification, or any other plan updates.
Use this compact inventory section for light mode:
## Behavior contract inventory (include when the change touches existing behavior)
| Bucket | Entry | Evidence | Source |
| --- | --- | --- | --- |
| Immediate observable behavior | … | `Primary source` / `Local reproduction` / `Unproven` | … |
| Internal state transition | … | … | … |
| Persistent / lifecycle behavior | … | … | … |
Use this full inventory section for strict mode:
## Behavior contract inventory (include when the change touches existing behavior)
| Bucket | Entry | Evidence | Source | Feeds equivalence dimension(s) |
| --- | --- | --- | --- | --- |
| Immediate observable behavior | … | `Primary source` / `Local reproduction` / `Unproven` | … | 1, 9, … |
| Internal state transition | … | … | … | 2, 3, 4, … |
| Persistent / lifecycle behavior | … | … | … | 5, 6, 8, … |
In both modes, every inventory row must carry exactly one of Primary source, Local reproduction, or Unproven. Unproven rows are triaged in the same way as any other Unproven item (risk, phase relevance, next review point). Empty bucket headings to satisfy structure are forbidden. The inventory and equivalence analysis may be omitted only when the current slice provably does not touch existing behavior — record the not-applicable rationale with an evidence class (e.g. "Local reproduction: confirmed by reading source that this slice introduces a new code path with no existing-behavior intersection"). Refactors, migrations, and internal implementation changes count as touching existing behavior under the Non-Negotiable Rule and stay under the inventory + equivalence rule even when most rows collapse to Not applicable with a documented rationale.
The behavioral equivalence section that follows the inventory must visibly anchor each dimension classification to inventory rows. A dimension cannot be classified Equivalent against an Unproven inventory entry; it is Unknown until the entry is proven.
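The anchoring rule for `Equivalent` can be expressed as a small downgrade check. This is a sketch under the assumption that each dimension carries the list of evidence classes of the inventory rows it cites; the function name and argument shape are hypothetical.

```python
def effective_classification(proposed: str, inventory_evidence: list) -> str:
    """Downgrade a proposed "Equivalent" to "Unknown" when any cited
    inventory row is still Unproven.

    Other classifications pass through unchanged, because the rule in the
    text is stated only for "Equivalent".
    """
    if proposed == "Equivalent" and "Unproven" in inventory_evidence:
        return "Unknown"
    return proposed
```

For example, a dimension that cites one `Local reproduction` row and one `Unproven` row would be reported as `Unknown` until the second row is proven.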
Use this compact equivalence section for light mode:
## Behavioral equivalence (include when the change touches existing behavior)
| Dimension | Scope | Classification | Evidence | Rationale (if Not applicable or Changed) | Test or proof check | Inventory row(s) |
| --- | --- | --- | --- | --- | --- | --- |
For `Changed (in scope)` dimensions (required in both modes):
- Requirement or user approval reference:
- Changed success criteria:
- Impact explanation:
- Corresponding test:
Use this full equivalence section for strict mode:
## Behavioral equivalence analysis (include when the change touches existing behavior)
| Dimension | Scope (in scope for change / must preserve) | Classification | Evidence or source | Rationale (for Not applicable / Changed) | Inventory row(s) |
| --- | --- | --- | --- | --- | --- |
For `Changed (in scope)` dimensions (required in both modes):
- Requirement or user approval reference:
- Changed success criteria:
- Impact explanation:
- Corresponding test:
The Changed (in scope) metadata (requirement or user approval reference, changed success criteria, impact explanation, corresponding test) is required regardless of mode. Do not omit it in light mode.
If any Unproven item exists, the Implementation plan section may contain only proof-gathering and test-preparation work. Do not include production changes that assume the missing proof will succeed.
If the user explicitly accepts a remaining blocker, append this optional section in either mode:
## Acknowledged risks for handoff
- Item:
- Why it remains unproven:
- User rationale:
- Impact area:
- Owner or decision-maker:
- Revisit trigger:
This section documents the exception request; it does not change the stop condition inside this skill.
Communication Style
- Use plain language first. Introduce jargon only when needed and define it once.
- Explain why each alternative exists and what would make one option safer than another.
- Make blockers obvious. Do not bury them at the end.
- Ask only the minimum blocking question after exploration is exhausted.
- If several questions resolve the same decision, they may be batched into one turn.
- Do not ask about defaults that can be chosen safely for the current phase.
- Keep blockers adjacent to the recommendation, not pages away from it.
- Be explicit about what is known, how it was verified, and what is still missing.
References
- Evidence classes and allowed phrasing: `references/evidence-rubric.md`
- Replacement, restoration, and rollback checks: `references/change-recovery-checklist.md`
- Design comparison rules, recommendation framing, and self-checks: `references/design-exploration-rules.md`
- Behavioral equivalence analysis for changes touching existing behavior: `references/behavioral-equivalence-analysis.md`
- Behavior contract inventory built before equivalence analysis: `references/behavior-contract-inventory.md`
- Plan-content classification, success-criteria freeze, plan-body firewall, and completion gate: `references/plan-boundary-controls.md`
- Selective failure-pattern checks for high-risk surfaces: `references/failure-pattern-checklist.md`