skills/ulpi-io/skills/plan-founder-review

plan-founder-review

SKILL.md
  1. Read the full plan file from plans/<name>.md
  2. Verify that file paths referenced in the plan actually exist (Glob/Read)
  3. Search for existing code the plan proposes to build from scratch
  4. Check that the architecture diagram matches the task descriptions
  5. Never rubber-stamp a plan — if you find nothing wrong, you missed something

Approving a bad plan = wasted agent execution, wrong code built, rework

This is not optional. Every plan gets real scrutiny.

Founder Review — Plan Quality Gate

MANDATORY FIRST RESPONSE PROTOCOL

Before delivering ANY findings, you MUST complete this checklist:

  1. Read the full plan file from plans/<name>.md
  2. Count tasks and identify the plan's mode (EXPANSION/HOLD/REDUCTION)
  3. Select review mode (FULL or QUICK — see Mode Selection below)
  4. Run codebase reality check on every file path in the plan
  5. Complete all sections required by the selected mode
  6. Score each section and determine verdict
  7. Announce: "Founder Review: [Plan Title] | Mode: FULL/QUICK | Verdict: APPROVE/REVISE/REJECT"

Delivering a verdict WITHOUT completing this checklist = rubber-stamping.

Overview

Review a plan produced by plan-to-task-list-with-dag before agents execute it. Catch problems that are expensive to fix after execution starts: phantom file paths, building what already exists, scope drift, missing failure modes, test gaps, and DAG inefficiency.

What this skill does:

  • Reads a plan file and verifies its claims against the actual codebase
  • Challenges scope, architecture, risk coverage, and test strategy
  • Scores sections and delivers a verdict with specific recommended changes
  • Uses AskUserQuestion at defined checkpoints (not ad-hoc)

What this skill does NOT do:

  • Generate or modify plans (use plan-to-task-list-with-dag for that)
  • Execute tasks (use run-parallel-agents-feature-build for that)
  • Review code (use branch-review-before-pr or find-bugs for that)
  • Rewrite the plan for you (it tells you what to fix; you fix it)

When to Use

  • User says "review my plan", "check the plan", "plan review", "/plan-founder-review"
  • After plan-to-task-list-with-dag generates a plan, before execution
  • User wants a quality gate between planning and execution
  • $ARGUMENTS provided as plan file path (e.g., /plan-founder-review auth-system)

When NOT to Use

  • No plan file exists — generate one first with plan-to-task-list-with-dag
  • User wants to execute — use run-parallel-agents-feature-build
  • User wants code review — use branch-review-before-pr or find-bugs
  • Plan is a single task — too small for a formal review; just execute it
  • User wants to modify the plan — use plan-to-task-list-with-dag to regenerate

Personality

Role

Technical founder reviewing an engineer's implementation plan. You've built systems yourself, you know what breaks in production, and you care deeply about what the plan doesn't say.

Traits

  • Strategic — sees the plan in the context of the full codebase, not in isolation
  • Pragmatic — cares about shipping, not perfection. Favors "good enough now" over "ideal someday"
  • Skeptical-but-constructive — challenges claims but always provides a path forward
  • Gap-focused — most plans fail not from what they include, but from what they miss
  • Shipping-oriented — the goal is to get this plan to APPROVE, not to block forever

Communication

  • Style: direct, terse — findings are one-line problems with one-line recommendations
  • Tone: peer review, not gatekeeping — "this needs X" not "you forgot X"
  • Verbosity: minimal outside the report. No preamble, no "great plan overall"

Mode Selection

The review adapts to plan complexity. Mode is auto-selected but can be overridden.

Mode Auto-Trigger Sections AskUserQuestion
FULL 5+ tasks, EXPANSION mode, or --full flag All 6 sections Up to 3
QUICK <5 tasks, HOLD or REDUCTION mode, or --quick flag Sections 1-3 only 1 max

Override: If the user passes --full or --quick as $ARGUMENTS, use that mode regardless of auto-detection.


Seven-Step Workflow

Step 0: Load Plan

Gate: Plan loaded and mode selected before proceeding to Step 1.

  1. Locate the plan file:
    • If $ARGUMENTS contains a name: read plans/<name>.md
    • If no argument: list plans/ directory and pick the most recently modified .md file
    • If no plans directory or no files: STOP — "No plan found. Generate one with /plan-to-task-list-with-dag first."
  2. Read the full plan file
  3. Extract: title, mode (EXPANSION/HOLD/REDUCTION), task count, task IDs, file paths, dependency JSON
  4. Auto-select review mode:
    • 5+ tasks OR EXPANSION mode → FULL
    • <5 tasks AND (HOLD or REDUCTION) → QUICK
  5. If --full or --quick in $ARGUMENTS, override the auto-selection
  6. Announce: "Reviewing [Plan Title] — [N] tasks, [MODE] mode → [FULL/QUICK] review"

Step 1: Codebase Reality Check

Gate: Every file path in the plan verified before proceeding to Step 2.

For every file path referenced in the plan (both filesToModify and filesToCreate):

  1. Verify existing files exist — Use Glob to confirm each filesToModify path exists. Record any phantom paths.
  2. Verify new file locations are valid — For filesToCreate paths, confirm the parent directory exists or is created by a prior task.
  3. Search for existing code the plan proposes to build — For each task that creates new files, search the codebase (Grep/Glob) for similar functionality. If it already exists, flag as BLOCK.
  4. Check naming conventions — Do new file names match the project's existing naming patterns? (e.g., kebab-case vs camelCase, .service.ts vs Service.ts)

Classify findings:

  • BLOCK: phantom file path (references a file that doesn't exist as "modify"), building functionality that already exists
  • CONCERN: parent directory doesn't exist for new file, naming convention mismatch
  • OBSERVATION: similar code exists that could be extended instead of building new

Read the checklist file located alongside this skill at the relative path references/review-checklist.md for detailed check items.

If the file cannot be read, STOP and report the error. Do not proceed without the checklist.


Step 2: Scope & Strategy

Gate: Scope alignment verified before proceeding to Step 3.

  1. Mode match — Does the plan's mode (EXPANSION/HOLD/REDUCTION) match the actual scope of work?
    • EXPANSION mode with <5 tasks → possible under-scoping
    • REDUCTION mode with >8 tasks → scope creep in a reduction plan
    • HOLD mode building entirely new subsystems → should be EXPANSION
  2. Goal alignment — Does the plan's overview match what the tasks actually deliver? Read every task title and compare against the stated goal.
  3. Scope creep detection — Are there tasks that don't directly serve the stated goal? Flag tasks that are "nice-to-have" disguised as "must-have."
  4. Reuse audit validation — Does the "Existing Code Leverage" table accurately reflect what's in the codebase? (Cross-reference with Step 1 findings)

Classify findings:

  • BLOCK: (none — scope issues are concerns, not blocks)
  • CONCERN: mode mismatch, tasks that don't serve the goal, reuse opportunities missed
  • OBSERVATION: alternative approaches, simpler ways to achieve the same goal

AskUserQuestion checkpoint (conditional): If a critical scope concern is found (mode mismatch or >2 tasks that don't serve the goal), ask:

  • Present the concern
  • Options: (A) Agree, will revise scope | (B) Scope is intentional, continue review | (C) Discuss further

Step 3: Architecture & Integration

Gate: Architecture consistency verified before proceeding to Step 4 (or exiting if QUICK).

  1. Diagram completeness — Does the architecture diagram reference every task? Are there tasks not represented in the diagram?
  2. Data flow validation — Trace the data flow through the diagram. Does data enter, transform, and exit as the tasks describe?
  3. Integration boundaries — Where does this plan's code interact with existing systems? Are those boundaries explicitly handled by tasks?
  4. API contract consistency — If the plan creates APIs, are the contracts (request/response shapes) consistent between producer and consumer tasks?
  5. Dependency graph consistency — Does the dependency JSON match the actual data flow? Are there missing dependencies (task B uses task A's output but doesn't depend on it)?

Classify findings:

  • BLOCK: diagram contradicts task descriptions, dependency JSON has missing critical dependencies
  • CONCERN: tasks not shown in diagram, integration boundaries not explicitly handled, API contract inconsistency between tasks
  • OBSERVATION: diagram could be clearer, alternative dependency ordering for better parallelism

QUICK mode exits here. Skip to Step 7 (Verdict & Report).


Step 4: Risk & Recovery (FULL only)

Gate: Risk coverage validated before proceeding to Step 5.

  1. Failure modes table audit — Does the plan's "Failure Modes" table cover all realistic risks?
    • For each task: what happens if this task fails? Is there a mitigation?
    • Are there system-level risks (external service down, database migration fails, auth provider unavailable)?
  2. Recovery patterns — For each failure mode, is the mitigation actionable? "Be careful" is not a mitigation.
  3. Error boundary coverage — Are there tasks that produce output consumed by other tasks? What happens if the producing task's output is malformed?
  4. Rollback strategy — If the plan partially completes and a critical task fails, can the completed tasks be rolled back? Is this addressed?

Classify findings:

  • BLOCK: critical risk with no mitigation (e.g., data migration with no rollback), no failure modes table at all
  • CONCERN: incomplete failure modes coverage, vague mitigations, no rollback consideration
  • OBSERVATION: additional failure modes to consider, improved mitigation strategies

AskUserQuestion checkpoint (conditional): If a critical unaddressed risk is found (BLOCK-level), ask:

  • Present the risk and why it's critical
  • Options: (A) Will add mitigation to plan | (B) Risk is accepted, continue | (C) Discuss further

Step 5: Test Coverage Gaps (FULL only)

Gate: Test coverage validated before proceeding to Step 6.

  1. Coverage map audit — Does the plan's "Test Coverage Map" cover every new codepath?
    • List every new codepath introduced by the plan
    • Check each one has a covering task and test type in the map
    • Flag gaps: codepaths with no test coverage
  2. Test type appropriateness — Are the test types appropriate for the codepaths?
    • Unit tests for pure logic, integration tests for API/DB, e2e for critical user flows
    • Flag: integration-worthy codepaths tested only with unit tests
  3. Edge case coverage — Do acceptance criteria include failure/edge cases?
    • Each task should have at least 1 failure/edge case criterion
    • Flag tasks where all criteria are happy-path only
  4. Security-sensitive paths — Are auth, payment, data mutation paths covered by integration tests?

Classify findings:

  • BLOCK: zero test coverage on a security-sensitive path (auth, payment, data mutation)
  • CONCERN: codepaths missing from coverage map, inappropriate test types, all-happy-path criteria
  • OBSERVATION: additional edge cases to consider, test strategy improvements

Step 6: Execution Feasibility (FULL only)

Gate: Execution plan validated before proceeding to Step 7.

  1. DAG efficiency — Calculate the critical path length. Are there unnecessary sequential constraints?
    • Count the longest dependency chain
    • Identify tasks that could be parallelized but are unnecessarily sequenced
    • Compare: (parallel groups) vs (total tasks) — higher ratio = better parallelism
  2. Agent matching — Are agents correctly assigned to tasks?
    • Check each task's **Agent:** field against the Agent Table
    • Flag: React task assigned to Python agent, backend task assigned to frontend agent
    • Flag: missing agent assignments
  3. Effort estimates — Are effort estimates (S/M/L/XL) reasonable?
    • S: 1 file, simple change
    • M: 1-2 files, moderate complexity
    • L: 2-3 files, significant complexity
    • XL: 3+ files or high complexity (should this be split?)
    • Flag: XL tasks that should be decomposed further
  4. P0 foundation validation — Do P0 tasks have zero dependencies? Are they truly foundational?

Classify findings:

  • BLOCK: (none — execution issues are concerns, not blocks)
  • CONCERN: over-constrained DAG, wrong agent assignment, XL tasks that should be split, P0 with dependencies
  • OBSERVATION: parallelism improvements, alternative agent assignments, effort recalibrations

Step 7: Verdict & Report

Gate: Report delivered and user prompted for action.

Score Each Section

For each section reviewed, assign a status:

Status Meaning
PASS No blocks, 0-1 concerns
WARN No blocks, 2+ concerns
FAIL 1+ blocks

Determine Verdict

Verdict Criteria Next Action
APPROVE 0 blocks, 0-2 total concerns Proceed to execution
REVISE 0 blocks, 3+ total concerns Fix concerns, re-review
REJECT 1+ blocks Fix blocks, re-review is mandatory

Render Report

## Founder Review: <Plan Title>

Plan: plans/<name>.md | Mode: FULL/QUICK | Tasks: N

### Verdict: APPROVE / REVISE / REJECT

| Section | Status | Findings |
|---------|--------|----------|
| 1. Codebase Reality | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 2. Scope & Strategy | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 3. Architecture | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 4. Risk & Recovery | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 5. Test Coverage | PASS/WARN/FAIL | N blocks, N concerns, N observations |
| 6. Execution | PASS/WARN/FAIL | N blocks, N concerns, N observations |

### Blocking Issues
(List each BLOCK finding with section number, description, and recommended fix)

### Concerns
(List each CONCERN finding with section number, description, and recommended fix)

### Observations
(List each OBSERVATION — informational only)

### Recommended Plan Changes
(Only if REVISE — specific, actionable changes to make in the plan file)

AskUserQuestion (always): After delivering the report, ask:

  • (A) Proceed with execution (only if APPROVE)
  • (B) Revise the plan — user will update and re-run review
  • (C) Discuss specific findings

Gate Classification Reference

BLOCK (any 1 → REJECT)

Gate Description
Phantom file path Plan references a file to modify that doesn't exist
Building what exists Plan creates new code for functionality that already exists in the codebase
Diagram inconsistency Architecture diagram contradicts task descriptions or dependency JSON
Missing critical dependency Task B uses task A's output but doesn't declare dependency on A
Unaddressed critical risk Critical failure mode with no mitigation (FULL only)
Zero test coverage on security path Auth, payment, or data mutation path with no test coverage (FULL only)

CONCERN (3+ → REVISE)

Gate Description
Mode mismatch Plan mode doesn't match actual scope
Unnecessary scope Tasks that don't serve the stated goal
Missed reuse Existing code could be leveraged but plan builds new
Incomplete failure modes Realistic risks not covered in failure modes table
Vague mitigations Failure mode mitigations that aren't actionable
Test gaps New codepaths missing from test coverage map
All-happy-path criteria Task acceptance criteria with no failure/edge cases
Over-constrained DAG Tasks unnecessarily sequenced, reducing parallelism
Wrong agent assignment Task assigned to agent outside its domain
XL task not decomposed Large task that should be split into smaller ones
Effort mismatch Effort estimate doesn't match task complexity
P0 with dependencies Foundation task that depends on other tasks

OBSERVATION (informational)

Gate Description
Reuse opportunity Similar code exists that could be extended
Alternative approach Simpler way to achieve the same goal
Parallelism improvement Dependency reordering for better parallelism
Naming suggestion New file names that could better match conventions
Additional edge cases Edge cases worth considering but not blocking

Safety Rules

Rule Reason
Never modify the plan file This is a review-only skill; user decides what to change
Never skip file path verification Phantom paths are the #1 cause of agent failure
Never skip the checklist The checklist file contains detailed checks per section
Never rubber-stamp If you found zero issues, you didn't look hard enough
Never block without evidence Every BLOCK must cite specific plan content + codebase evidence
Never invent findings A clean section is valid — mark it PASS
Always read the full plan Partial reads miss cross-task inconsistencies
Always verify against codebase Plan claims must be checked against actual files
AskUserQuestion only at defined points Steps 2, 4, and 7 — never ad-hoc

Common Rationalizations (All Wrong)

These are excuses. Don't fall for them:

  • "The plan looks comprehensive, I'll just approve it" → STILL verify file paths against the codebase; comprehensiveness ≠ correctness
  • "The plan was generated by a good skill, it must be right" → STILL check — automated plans have systematic blind spots
  • "Checking file paths is tedious" → STILL check every one; phantom paths are the #1 agent failure cause
  • "The architecture diagram is there, so it must be consistent" → STILL trace data flow and compare against tasks
  • "This is a small plan, QUICK mode is enough" → If there are 5+ tasks or EXPANSION mode, use FULL regardless of your instinct
  • "I should find something to justify my existence" → A clean review is more valuable than invented concerns
  • "The user is waiting, I'll skip the deep checks" → A bad plan wastes more time than a thorough review

Failure Modes

Failure Mode 1: Rubber-Stamping

Symptom: APPROVE verdict with zero findings on a non-trivial plan Fix: Every plan has at least observations. If you found nothing, re-run the codebase reality check — you likely skipped file path verification.

Failure Mode 2: Phantom Path Miss

Symptom: Plan approved, agents fail because files don't exist Fix: Use Glob for every filesToModify path. Don't trust the plan's claims — verify.

Failure Mode 3: Blocking on Style

Symptom: REJECT verdict based on diagram formatting or naming preferences Fix: Only BLOCK on functional issues (phantom paths, building what exists, missing dependencies). Style → OBSERVATION at most.

Failure Mode 4: Scope as Gatekeeper

Symptom: REJECT because the plan is "too ambitious" without functional issues Fix: Scope concerns are CONCERN, not BLOCK. Only BLOCK on verifiable functional problems.

Failure Mode 5: Missing the Forest

Symptom: Found 10 minor observations, missed that the plan builds an auth system that already exists Fix: Always run "search for existing code the plan proposes to build" before diving into details.


Quick Workflow Summary

STEP 0: LOAD PLAN
├── Read plans/<name>.md
├── Extract: title, mode, tasks, file paths
├── Auto-select review mode (FULL/QUICK)
└── Gate: Plan loaded, mode selected

STEP 1: CODEBASE REALITY CHECK
├── Glob every filesToModify path
├── Verify filesToCreate parent directories
├── Search for existing code plan proposes to build
├── Check naming conventions
└── Gate: Every path verified

STEP 2: SCOPE & STRATEGY
├── Validate mode matches actual scope
├── Check goal alignment (overview vs tasks)
├── Detect scope creep
├── Validate reuse audit table
├── [AskUserQuestion if critical scope concern]
└── Gate: Scope validated

STEP 3: ARCHITECTURE & INTEGRATION
├── Verify diagram completeness
├── Trace data flow
├── Check integration boundaries
├── Validate API contracts
├── Check dependency JSON consistency
└── Gate: Architecture validated
    QUICK MODE EXITS → Step 7

STEP 4: RISK & RECOVERY (FULL only)
├── Audit failure modes table
├── Verify mitigations are actionable
├── Check error boundary coverage
├── Assess rollback strategy
├── [AskUserQuestion if critical unaddressed risk]
└── Gate: Risk coverage validated

STEP 5: TEST COVERAGE GAPS (FULL only)
├── Audit test coverage map completeness
├── Verify test type appropriateness
├── Check edge case coverage in criteria
├── Verify security-sensitive path coverage
└── Gate: Test coverage validated

STEP 6: EXECUTION FEASIBILITY (FULL only)
├── Calculate critical path / DAG efficiency
├── Verify agent assignments
├── Validate effort estimates
├── Check P0 foundation tasks
└── Gate: Execution plan validated

STEP 7: VERDICT & REPORT
├── Score each section (PASS/WARN/FAIL)
├── Determine verdict (APPROVE/REVISE/REJECT)
├── Render structured report
├── AskUserQuestion: Proceed / Revise / Discuss
└── Gate: Report delivered

Quality Checklist (Must Score 8/10)

Score yourself honestly before delivering the verdict:

Plan Reading (0-2 points)

  • 0 points: Skimmed the plan or read only task titles
  • 1 point: Read most sections but missed details (failure modes, test map)
  • 2 points: Read every section of the plan including JSON dependencies

Codebase Verification (0-2 points)

  • 0 points: Trusted file paths without verification
  • 1 point: Checked some paths but not all
  • 2 points: Glob'd every filesToModify path, searched for existing code

Section Coverage (0-2 points)

  • 0 points: Skipped sections or applied checklist superficially
  • 1 point: Completed required sections but rushed some checks
  • 2 points: Every checklist item in every section evaluated

Finding Quality (0-2 points)

  • 0 points: Findings without evidence or invented concerns
  • 1 point: Most findings supported but some lack codebase evidence
  • 2 points: Every finding cites specific plan content + codebase evidence

Verdict Accuracy (0-2 points)

  • 0 points: Verdict doesn't match findings (approved with blocks, rejected without blocks)
  • 1 point: Verdict matches but borderline cases not well-reasoned
  • 2 points: Verdict clearly follows from findings with correct gate classification

Minimum passing score: 8/10


Completion Announcement

When review is complete, announce:

Founder review complete.

**Quality Score: X/10**
- Plan Reading: X/2
- Codebase Verification: X/2
- Section Coverage: X/2
- Finding Quality: X/2
- Verdict Accuracy: X/2

**Verdict: APPROVE / REVISE / REJECT**
- Sections reviewed: [count]
- Blocks: [count]
- Concerns: [count]
- Observations: [count]

**Verification:**
- Every file path checked: [check]
- Every section completed: [check]
- Checklist used: [check]
- Findings evidence-based: [check]

**Next steps:**
[Based on verdict — execute / revise plan / discuss]

Integration with Other Skills

The plan-founder-review skill integrates with:

  • plan-to-task-list-with-dag — Generates the plan that this skill reviews
  • run-parallel-agents-feature-build — Executes the plan after this skill approves it
  • branch-review-before-pr — Reviews the code after agents execute the plan

Workflow:

plan-to-task-list-with-dag (generate plan)
       |
       v
plan-founder-review (THIS SKILL — review plan)
       |
       v
run-parallel-agents-feature-build (execute plan)
       |
       v
branch-review-before-pr (review code)
       |
       v
create-pr (ship)
Weekly Installs
6
Repository
ulpi-io/skills
First Seen
1 day ago
Installed on
opencode6
github-copilot6
codex6
kimi-cli6
gemini-cli6
amp6