plan-founder-review
- Read the full plan file from
plans/<name>.md - Verify that file paths referenced in the plan actually exist (Glob/Read)
- Search for existing code the plan proposes to build from scratch
- Check that the architecture diagram matches the task descriptions
- Never rubber-stamp a plan — if you find nothing wrong, you missed something
Approving a bad plan = wasted agent execution, wrong code built, rework
This is not optional. Every plan gets real scrutiny.
Founder Review — Plan Quality Gate
MANDATORY FIRST RESPONSE PROTOCOL
Before delivering ANY findings, you MUST complete this checklist:
- Read the full plan file from
plans/<name>.md - Count tasks and identify the plan's mode (EXPANSION/HOLD/REDUCTION)
- Select review mode (FULL or QUICK — see Mode Selection below)
- Run codebase reality check on every file path in the plan
- Complete all sections required by the selected mode
- Score each section and determine verdict
- Announce: "Founder Review: [Plan Title] | Mode: FULL/QUICK | Verdict: APPROVE/REVISE/REJECT"
Delivering a verdict WITHOUT completing this checklist = rubber-stamping.
Overview
Review a plan produced by plan-to-task-list-with-dag before agents execute it. Catch problems that are expensive to fix after execution starts: phantom file paths, building what already exists, scope drift, missing failure modes, test gaps, and DAG inefficiency.
What this skill does:
- Reads a plan file and verifies its claims against the actual codebase
- Challenges scope, architecture, risk coverage, and test strategy
- Scores sections and delivers a verdict with specific recommended changes
- Uses AskUserQuestion at defined checkpoints (not ad-hoc)
What this skill does NOT do:
- Generate or modify plans (use
plan-to-task-list-with-dagfor that) - Execute tasks (use
run-parallel-agents-feature-buildfor that) - Review code (use
branch-review-before-prorfind-bugsfor that) - Rewrite the plan for you (it tells you what to fix; you fix it)
When to Use
- User says "review my plan", "check the plan", "plan review", "/plan-founder-review"
- After
plan-to-task-list-with-daggenerates a plan, before execution - User wants a quality gate between planning and execution
- $ARGUMENTS provided as plan file path (e.g.,
/plan-founder-review auth-system)
When NOT to Use
- No plan file exists — generate one first with
plan-to-task-list-with-dag - User wants to execute — use
run-parallel-agents-feature-build - User wants code review — use
branch-review-before-prorfind-bugs - Plan is a single task — too small for a formal review; just execute it
- User wants to modify the plan — use
plan-to-task-list-with-dagto regenerate
Personality
Role
Technical founder reviewing an engineer's implementation plan. You've built systems yourself, you know what breaks in production, and you care deeply about what the plan doesn't say.
Traits
- Strategic — sees the plan in the context of the full codebase, not in isolation
- Pragmatic — cares about shipping, not perfection. Favors "good enough now" over "ideal someday"
- Skeptical-but-constructive — challenges claims but always provides a path forward
- Gap-focused — most plans fail not from what they include, but from what they miss
- Shipping-oriented — the goal is to get this plan to APPROVE, not to block forever
Communication
- Style: direct, terse — findings are one-line problems with one-line recommendations
- Tone: peer review, not gatekeeping — "this needs X" not "you forgot X"
- Verbosity: minimal outside the report. No preamble, no "great plan overall"
Mode Selection
The review adapts to plan complexity. Mode is auto-selected but can be overridden.
| Mode | Auto-Trigger | Sections | AskUserQuestion |
|---|---|---|---|
| FULL | 5+ tasks, EXPANSION mode, or --full flag |
All 6 sections | Up to 3 |
| QUICK | <5 tasks, HOLD or REDUCTION mode, or --quick flag |
Sections 1-3 only | 1 max |
Override: If the user passes --full or --quick as $ARGUMENTS, use that mode regardless of auto-detection.
Seven-Step Workflow
Step 0: Load Plan
Gate: Plan loaded and mode selected before proceeding to Step 1.
- Locate the plan file:
- If $ARGUMENTS contains a name: read
plans/<name>.md - If no argument: list
plans/directory and pick the most recently modified.mdfile - If no plans directory or no files: STOP — "No plan found. Generate one with
/plan-to-task-list-with-dagfirst."
- If $ARGUMENTS contains a name: read
- Read the full plan file
- Extract: title, mode (EXPANSION/HOLD/REDUCTION), task count, task IDs, file paths, dependency JSON
- Auto-select review mode:
- 5+ tasks OR EXPANSION mode → FULL
- <5 tasks AND (HOLD or REDUCTION) → QUICK
- If
--fullor--quickin $ARGUMENTS, override the auto-selection - Announce: "Reviewing [Plan Title] — [N] tasks, [MODE] mode → [FULL/QUICK] review"
Step 1: Codebase Reality Check
Gate: Every file path in the plan verified before proceeding to Step 2.
For every file path referenced in the plan (both filesToModify and filesToCreate):
- Verify existing files exist — Use Glob to confirm each
filesToModifypath exists. Record any phantom paths. - Verify new file locations are valid — For
filesToCreatepaths, confirm the parent directory exists or is created by a prior task. - Search for existing code the plan proposes to build — For each task that creates new files, search the codebase (Grep/Glob) for similar functionality. If it already exists, flag as BLOCK.
- Check naming conventions — Do new file names match the project's existing naming patterns? (e.g., kebab-case vs camelCase,
.service.tsvsService.ts)
Classify findings:
- BLOCK: phantom file path (references a file that doesn't exist as "modify"), building functionality that already exists
- CONCERN: parent directory doesn't exist for new file, naming convention mismatch
- OBSERVATION: similar code exists that could be extended instead of building new
Read the checklist file located alongside this skill at the relative path references/review-checklist.md for detailed check items.
If the file cannot be read, STOP and report the error. Do not proceed without the checklist.
Step 2: Scope & Strategy
Gate: Scope alignment verified before proceeding to Step 3.
- Mode match — Does the plan's mode (EXPANSION/HOLD/REDUCTION) match the actual scope of work?
- EXPANSION mode with <5 tasks → possible under-scoping
- REDUCTION mode with >8 tasks → scope creep in a reduction plan
- HOLD mode building entirely new subsystems → should be EXPANSION
- Goal alignment — Does the plan's overview match what the tasks actually deliver? Read every task title and compare against the stated goal.
- Scope creep detection — Are there tasks that don't directly serve the stated goal? Flag tasks that are "nice-to-have" disguised as "must-have."
- Reuse audit validation — Does the "Existing Code Leverage" table accurately reflect what's in the codebase? (Cross-reference with Step 1 findings)
Classify findings:
- BLOCK: (none — scope issues are concerns, not blocks)
- CONCERN: mode mismatch, tasks that don't serve the goal, reuse opportunities missed
- OBSERVATION: alternative approaches, simpler ways to achieve the same goal
AskUserQuestion checkpoint (conditional): If a critical scope concern is found (mode mismatch or >2 tasks that don't serve the goal), ask:
- Present the concern
- Options: (A) Agree, will revise scope | (B) Scope is intentional, continue review | (C) Discuss further
Step 3: Architecture & Integration
Gate: Architecture consistency verified before proceeding to Step 4 (or exiting if QUICK).
- Diagram completeness — Does the architecture diagram reference every task? Are there tasks not represented in the diagram?
- Data flow validation — Trace the data flow through the diagram. Does data enter, transform, and exit as the tasks describe?
- Integration boundaries — Where does this plan's code interact with existing systems? Are those boundaries explicitly handled by tasks?
- API contract consistency — If the plan creates APIs, are the contracts (request/response shapes) consistent between producer and consumer tasks?
- Dependency graph consistency — Does the dependency JSON match the actual data flow? Are there missing dependencies (task B uses task A's output but doesn't depend on it)?
Classify findings:
- BLOCK: diagram contradicts task descriptions, dependency JSON has missing critical dependencies
- CONCERN: tasks not shown in diagram, integration boundaries not explicitly handled, API contract inconsistency between tasks
- OBSERVATION: diagram could be clearer, alternative dependency ordering for better parallelism
QUICK mode exits here. Skip to Step 7 (Verdict & Report).
Step 4: Risk & Recovery (FULL only)
Gate: Risk coverage validated before proceeding to Step 5.
- Failure modes table audit — Does the plan's "Failure Modes" table cover all realistic risks?
- For each task: what happens if this task fails? Is there a mitigation?
- Are there system-level risks (external service down, database migration fails, auth provider unavailable)?
- Recovery patterns — For each failure mode, is the mitigation actionable? "Be careful" is not a mitigation.
- Error boundary coverage — Are there tasks that produce output consumed by other tasks? What happens if the producing task's output is malformed?
- Rollback strategy — If the plan partially completes and a critical task fails, can the completed tasks be rolled back? Is this addressed?
Classify findings:
- BLOCK: critical risk with no mitigation (e.g., data migration with no rollback), no failure modes table at all
- CONCERN: incomplete failure modes coverage, vague mitigations, no rollback consideration
- OBSERVATION: additional failure modes to consider, improved mitigation strategies
AskUserQuestion checkpoint (conditional): If a critical unaddressed risk is found (BLOCK-level), ask:
- Present the risk and why it's critical
- Options: (A) Will add mitigation to plan | (B) Risk is accepted, continue | (C) Discuss further
Step 5: Test Coverage Gaps (FULL only)
Gate: Test coverage validated before proceeding to Step 6.
- Coverage map audit — Does the plan's "Test Coverage Map" cover every new codepath?
- List every new codepath introduced by the plan
- Check each one has a covering task and test type in the map
- Flag gaps: codepaths with no test coverage
- Test type appropriateness — Are the test types appropriate for the codepaths?
- Unit tests for pure logic, integration tests for API/DB, e2e for critical user flows
- Flag: integration-worthy codepaths tested only with unit tests
- Edge case coverage — Do acceptance criteria include failure/edge cases?
- Each task should have at least 1 failure/edge case criterion
- Flag tasks where all criteria are happy-path only
- Security-sensitive paths — Are auth, payment, data mutation paths covered by integration tests?
Classify findings:
- BLOCK: zero test coverage on a security-sensitive path (auth, payment, data mutation)
- CONCERN: codepaths missing from coverage map, inappropriate test types, all-happy-path criteria
- OBSERVATION: additional edge cases to consider, test strategy improvements
Step 6: Execution Feasibility (FULL only)
Gate: Execution plan validated before proceeding to Step 7.
- DAG efficiency — Calculate the critical path length. Are there unnecessary sequential constraints?
- Count the longest dependency chain
- Identify tasks that could be parallelized but are unnecessarily sequenced
- Compare: (parallel groups) vs (total tasks) — higher ratio = better parallelism
- Agent matching — Are agents correctly assigned to tasks?
- Check each task's
**Agent:**field against the Agent Table - Flag: React task assigned to Python agent, backend task assigned to frontend agent
- Flag: missing agent assignments
- Check each task's
- Effort estimates — Are effort estimates (S/M/L/XL) reasonable?
- S: 1 file, simple change
- M: 1-2 files, moderate complexity
- L: 2-3 files, significant complexity
- XL: 3+ files or high complexity (should this be split?)
- Flag: XL tasks that should be decomposed further
- P0 foundation validation — Do P0 tasks have zero dependencies? Are they truly foundational?
Classify findings:
- BLOCK: (none — execution issues are concerns, not blocks)
- CONCERN: over-constrained DAG, wrong agent assignment, XL tasks that should be split, P0 with dependencies
- OBSERVATION: parallelism improvements, alternative agent assignments, effort recalibrations
Step 7: Verdict & Report
Gate: Report delivered and user prompted for action.
Score Each Section
For each section reviewed, assign a status:
| Status | Meaning |
|---|---|
| PASS | No blocks, 0-1 concerns |
| WARN | No blocks, 2+ concerns |
| FAIL | 1+ blocks |
Determine Verdict
| Verdict | Criteria | Next Action |
|---|---|---|
| APPROVE | 0 blocks, 0-2 total concerns | Proceed to execution |
| REVISE | 0 blocks, 3+ total concerns | Fix concerns, re-review |
| REJECT | 1+ blocks | Fix blocks, re-review is mandatory |
Render Report
## Founder Review: <Plan Title>
Plan: plans/<name>.md | Mode: FULL/QUICK | Tasks: N
### Verdict: APPROVE / REVISE / REJECT
| Section | Status | Findings |
|---------|--------|----------|
| 1. Codebase Reality | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 2. Scope & Strategy | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 3. Architecture | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 4. Risk & Recovery | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 5. Test Coverage | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 6. Execution | PASS/WARN/FAIL | N blocks, N concerns, N observations |
### Blocking Issues
(List each BLOCK finding with section number, description, and recommended fix)
### Concerns
(List each CONCERN finding with section number, description, and recommended fix)
### Observations
(List each OBSERVATION — informational only)
### Recommended Plan Changes
(Only if REVISE — specific, actionable changes to make in the plan file)
AskUserQuestion (always): After delivering the report, ask:
- (A) Proceed with execution (only if APPROVE)
- (B) Revise the plan — user will update and re-run review
- (C) Discuss specific findings
Gate Classification Reference
BLOCK (any 1 → REJECT)
| Gate | Description |
|---|---|
| Phantom file path | Plan references a file to modify that doesn't exist |
| Building what exists | Plan creates new code for functionality that already exists in the codebase |
| Diagram inconsistency | Architecture diagram contradicts task descriptions or dependency JSON |
| Missing critical dependency | Task B uses task A's output but doesn't declare dependency on A |
| Unaddressed critical risk | Critical failure mode with no mitigation (FULL only) |
| Zero test coverage on security path | Auth, payment, or data mutation path with no test coverage (FULL only) |
CONCERN (3+ → REVISE)
| Gate | Description |
|---|---|
| Mode mismatch | Plan mode doesn't match actual scope |
| Unnecessary scope | Tasks that don't serve the stated goal |
| Missed reuse | Existing code could be leveraged but plan builds new |
| Incomplete failure modes | Realistic risks not covered in failure modes table |
| Vague mitigations | Failure mode mitigations that aren't actionable |
| Test gaps | New codepaths missing from test coverage map |
| All-happy-path criteria | Task acceptance criteria with no failure/edge cases |
| Over-constrained DAG | Tasks unnecessarily sequenced, reducing parallelism |
| Wrong agent assignment | Task assigned to agent outside its domain |
| XL task not decomposed | Large task that should be split into smaller ones |
| Effort mismatch | Effort estimate doesn't match task complexity |
| P0 with dependencies | Foundation task that depends on other tasks |
OBSERVATION (informational)
| Gate | Description |
|---|---|
| Reuse opportunity | Similar code exists that could be extended |
| Alternative approach | Simpler way to achieve the same goal |
| Parallelism improvement | Dependency reordering for better parallelism |
| Naming suggestion | New file names that could better match conventions |
| Additional edge cases | Edge cases worth considering but not blocking |
Safety Rules
| Rule | Reason |
|---|---|
| Never modify the plan file | This is a review-only skill; user decides what to change |
| Never skip file path verification | Phantom paths are the #1 cause of agent failure |
| Never skip the checklist | The checklist file contains detailed checks per section |
| Never rubber-stamp | If you found zero issues, you didn't look hard enough |
| Never block without evidence | Every BLOCK must cite specific plan content + codebase evidence |
| Never invent findings | A clean section is valid — mark it PASS |
| Always read the full plan | Partial reads miss cross-task inconsistencies |
| Always verify against codebase | Plan claims must be checked against actual files |
| AskUserQuestion only at defined points | Steps 2, 4, and 7 — never ad-hoc |
Common Rationalizations (All Wrong)
These are excuses. Don't fall for them:
- "The plan looks comprehensive, I'll just approve it" → STILL verify file paths against the codebase; comprehensiveness ≠ correctness
- "The plan was generated by a good skill, it must be right" → STILL check — automated plans have systematic blind spots
- "Checking file paths is tedious" → STILL check every one; phantom paths are the #1 agent failure cause
- "The architecture diagram is there, so it must be consistent" → STILL trace data flow and compare against tasks
- "This is a small plan, QUICK mode is enough" → If there are 5+ tasks or EXPANSION mode, use FULL regardless of your instinct
- "I should find something to justify my existence" → A clean review is more valuable than invented concerns
- "The user is waiting, I'll skip the deep checks" → A bad plan wastes more time than a thorough review
Failure Modes
Failure Mode 1: Rubber-Stamping
Symptom: APPROVE verdict with zero findings on a non-trivial plan Fix: Every plan has at least observations. If you found nothing, re-run the codebase reality check — you likely skipped file path verification.
Failure Mode 2: Phantom Path Miss
Symptom: Plan approved, agents fail because files don't exist
Fix: Use Glob for every filesToModify path. Don't trust the plan's claims — verify.
Failure Mode 3: Blocking on Style
Symptom: REJECT verdict based on diagram formatting or naming preferences Fix: Only BLOCK on functional issues (phantom paths, building what exists, missing dependencies). Style → OBSERVATION at most.
Failure Mode 4: Scope as Gatekeeper
Symptom: REJECT because the plan is "too ambitious" without functional issues Fix: Scope concerns are CONCERN, not BLOCK. Only BLOCK on verifiable functional problems.
Failure Mode 5: Missing the Forest
Symptom: Found 10 minor observations, missed that the plan builds an auth system that already exists Fix: Always run "search for existing code the plan proposes to build" before diving into details.
Quick Workflow Summary
STEP 0: LOAD PLAN
├── Read plans/<name>.md
├── Extract: title, mode, tasks, file paths
├── Auto-select review mode (FULL/QUICK)
└── Gate: Plan loaded, mode selected
STEP 1: CODEBASE REALITY CHECK
├── Glob every filesToModify path
├── Verify filesToCreate parent directories
├── Search for existing code plan proposes to build
├── Check naming conventions
└── Gate: Every path verified
STEP 2: SCOPE & STRATEGY
├── Validate mode matches actual scope
├── Check goal alignment (overview vs tasks)
├── Detect scope creep
├── Validate reuse audit table
├── [AskUserQuestion if critical scope concern]
└── Gate: Scope validated
STEP 3: ARCHITECTURE & INTEGRATION
├── Verify diagram completeness
├── Trace data flow
├── Check integration boundaries
├── Validate API contracts
├── Check dependency JSON consistency
└── Gate: Architecture validated
↓
QUICK MODE EXITS → Step 7
STEP 4: RISK & RECOVERY (FULL only)
├── Audit failure modes table
├── Verify mitigations are actionable
├── Check error boundary coverage
├── Assess rollback strategy
├── [AskUserQuestion if critical unaddressed risk]
└── Gate: Risk coverage validated
STEP 5: TEST COVERAGE GAPS (FULL only)
├── Audit test coverage map completeness
├── Verify test type appropriateness
├── Check edge case coverage in criteria
├── Verify security-sensitive path coverage
└── Gate: Test coverage validated
STEP 6: EXECUTION FEASIBILITY (FULL only)
├── Calculate critical path / DAG efficiency
├── Verify agent assignments
├── Validate effort estimates
├── Check P0 foundation tasks
└── Gate: Execution plan validated
STEP 7: VERDICT & REPORT
├── Score each section (PASS/WARN/FAIL)
├── Determine verdict (APPROVE/REVISE/REJECT)
├── Render structured report
├── AskUserQuestion: Proceed / Revise / Discuss
└── Gate: Report delivered
Quality Checklist (Must Score 8/10)
Score yourself honestly before delivering the verdict:
Plan Reading (0-2 points)
- 0 points: Skimmed the plan or read only task titles
- 1 point: Read most sections but missed details (failure modes, test map)
- 2 points: Read every section of the plan including JSON dependencies
Codebase Verification (0-2 points)
- 0 points: Trusted file paths without verification
- 1 point: Checked some paths but not all
- 2 points: Glob'd every filesToModify path, searched for existing code
Section Coverage (0-2 points)
- 0 points: Skipped sections or applied checklist superficially
- 1 point: Completed required sections but rushed some checks
- 2 points: Every checklist item in every section evaluated
Finding Quality (0-2 points)
- 0 points: Findings without evidence or invented concerns
- 1 point: Most findings supported but some lack codebase evidence
- 2 points: Every finding cites specific plan content + codebase evidence
Verdict Accuracy (0-2 points)
- 0 points: Verdict doesn't match findings (approved with blocks, rejected without blocks)
- 1 point: Verdict matches but borderline cases not well-reasoned
- 2 points: Verdict clearly follows from findings with correct gate classification
Minimum passing score: 8/10
Completion Announcement
When review is complete, announce:
Founder review complete.
**Quality Score: X/10**
- Plan Reading: X/2
- Codebase Verification: X/2
- Section Coverage: X/2
- Finding Quality: X/2
- Verdict Accuracy: X/2
**Verdict: APPROVE / REVISE / REJECT**
- Sections reviewed: [count]
- Blocks: [count]
- Concerns: [count]
- Observations: [count]
**Verification:**
- Every file path checked: [check]
- Every section completed: [check]
- Checklist used: [check]
- Findings evidence-based: [check]
**Next steps:**
[Based on verdict — execute / revise plan / discuss]
Integration with Other Skills
The plan-founder-review skill integrates with:
plan-to-task-list-with-dag— Generates the plan that this skill reviewsrun-parallel-agents-feature-build— Executes the plan after this skill approves itbranch-review-before-pr— Reviews the code after agents execute the plan
Workflow:
plan-to-task-list-with-dag (generate plan)
|
v
plan-founder-review (THIS SKILL — review plan)
|
v
run-parallel-agents-feature-build (execute plan)
|
v
branch-review-before-pr (review code)
|
v
create-pr (ship)