# Critique Product
## Preparation
Before anything else, invoke /product-thinking to load strategic context, user segments, and current product landscape.
Then read ALL context files:
- `.acumen.md` — product strategy and positioning
- `.acumen/competitors.md` — competitive landscape
- `.acumen/personas.md` — user segments and behaviors
- `.acumen/features.md` — current feature catalog
- `./DESIGN.md` — visual identity contract (only when critiquing UI-touching artifacts; if missing, flag it)
A critique without context is just opinion. You need to know what the product is, who it serves, and what it's competing against before you can evaluate whether an artifact is good.
## Mindset
Read this as if you're the engineer who has to build it, the exec who has to fund it, and the designer who has to make it real. Does it survive all three?
The engineer asks: "Can I scope work from this? Are the edge cases covered? Will I have to come back with 40 clarifying questions?"
The exec asks: "Why should I fund this over the other six things competing for the same resources? What's the expected return?"
The designer asks: "Do I know who this is for? Is the problem clear enough that I can design a solution without guessing?"
If any of them would struggle, the artifact needs work.
## Core Reflex
### Step 1: AI Product Slop Test
Run the slop check first. Scan for these 12 tells — each one weakens the document:
- "Users want..." without evidence (who said this? how many? when?)
- Missing baselines (targets without current numbers are wishes)
- Vague scope ("and more," "etc.," "additional features" — scope creep hiding in plain sight)
- No prioritization rationale (why THIS order? why NOW?)
- Solutions masquerading as problems ("users need a dashboard" is a solution; what's the actual problem?)
- Missing edge cases (what happens when it fails? when data is missing? when the user does something unexpected?)
- Competitor blindness (no mention of alternatives users have today)
- Metric-free success criteria ("improve the experience" — how would you know?)
- One-size-fits-all personas ("our users" instead of specific segments with different needs)
- No tradeoff acknowledgment (everything is upside, nothing is cost)
- Jargon without definition (acronyms and internal terms that assume shared context)
- Missing timeline or sequencing (what ships first? what depends on what?)
Count the tells. More than 4 is a problem. More than 7 means the artifact needs a rewrite, not a revision.
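The counting rule can be sketched as a small helper. Only the thresholds come from this step; the function name and return labels are illustrative:

```python
# Hypothetical helper mapping the tell count to the thresholds above:
# more than 4 tells is a problem; more than 7 means rewrite, not revision.
def slop_verdict(tells_found: list[str]) -> str:
    count = len(tells_found)
    if count > 7:
        return "rewrite"
    if count > 4:
        return "problem"
    return "pass"

print(slop_verdict(["missing baselines", "vague scope", "no tradeoffs"]))  # pass
```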
### Step 2: Score Across 9 Dimensions
Score each dimension 0-4. Be honest. A 4 means genuinely excellent. Most artifacts score 18-28 out of 36.
| # | Dimension | What you're evaluating | 0 | 2 | 4 |
|---|---|---|---|---|---|
| 1 | Problem clarity | Is the problem specific and evidence-based? | Problem not stated or is a disguised solution | Problem stated but vague or missing evidence | Specific problem with user evidence and quantified impact |
| 2 | User specificity | Do you know exactly who this is for? | "Users" with no segmentation | Named segment but generic description | Specific persona with behaviors, context, and needs |
| 3 | Metric rigor | Are there baselines, targets, and measurement plans? | No metrics or unmeasurable goals | Metrics named but missing baselines or targets | Full metric tree with baselines, targets, method, and owner |
| 4 | Scope discipline | Is in/out clear with reasoning? | No scope boundaries | In-scope listed but no out-of-scope or rationale | Clear in/out with reasoning for each boundary decision |
| 5 | Strategic coherence | Does this connect to a larger thesis? | No strategic connection | Mentions strategy but link is hand-wavy | Clear causal chain from strategy to this initiative |
| 6 | Edge case coverage | What breaks? | No edge cases considered | Some edge cases but major gaps | Comprehensive edge cases with mitigation or deferral rationale |
| 7 | Communication quality | Can all three audiences act on this? | Confusing or audience-unclear | One audience served well, others neglected | Engineer, exec, and designer can each extract what they need |
| 8 | Slop score | How many of the 12 AI tells are present? | 7+ tells | 3-6 tells | 0-2 tells |
| 9 | Context grounding | References competitors, personas, feedback? | Operating in a vacuum | Some references but shallow | Deeply grounded in competitive reality, persona data, and user signal |
Reference pm-heuristics for the full scoring rubric and calibration examples.
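The scoring step reduces to a simple aggregation. In this sketch, the 0-4 range and the /36 total come from the rubric above; the dimension keys and function name are assumptions:

```python
# Hypothetical scorer: sum the nine 0-4 dimension scores into the /36 total.
# Dimension keys and function name are illustrative, not part of the rubric.
DIMENSIONS = [
    "problem_clarity", "user_specificity", "metric_rigor",
    "scope_discipline", "strategic_coherence", "edge_case_coverage",
    "communication_quality", "slop_score", "context_grounding",
]

def total_score(scores: dict[str, int]) -> int:
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"unscored dimensions: {missing}")
    bad = [d for d in DIMENSIONS if not 0 <= scores[d] <= 4]
    if bad:
        raise ValueError(f"scores must be 0-4: {bad}")
    return sum(scores[d] for d in DIMENSIONS)

print(total_score({d: 3 for d in DIMENSIONS}))  # 27
```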
### Step 3: Persona-Based Review
Reference reviewer-personas to evaluate the artifact from each reviewer's perspective. Flag where specific audiences would get stuck or push back.
### Step 4: Assumption Validation
After scoring, identify the riskiest assumptions in the artifact — the ones that, if wrong, make everything else irrelevant. For each:
- State the assumption as a falsifiable hypothesis ("We believe [persona] will [behavior] because [reason]")
- Check existing signal — does feedback, analytics, or prior research already validate or invalidate this? Don't propose a test for something you already know.
- Match method to assumption type:
- Behavioral (will they do X?) — observe actions, not words. Analytics, session recordings.
- Preference (do they want X over Y?) — ask with commitment attached. "Would you pay?" has signal.
- Technical feasibility (can we build X?) — prototype or spike, timeboxed.
- Market demand (does anyone care?) — fake door tests, landing pages, waitlists.
- Define decision criteria before the test: "If we see X, we proceed. If Y, we pivot. If Z, we kill it."
If all key assumptions already have strong signal, say so — validation is not always needed. If the artifact is based on assumptions with no evidence, flag that as the primary issue.
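One way to keep each assumption falsifiable is to record it as structured data with the decision criteria attached up front. A minimal sketch; the field names and every example value are hypothetical:

```python
from dataclasses import dataclass

# Hypothetical record for one risky assumption, mirroring the fields above.
@dataclass
class Assumption:
    hypothesis: str       # "We believe [persona] will [behavior] because [reason]"
    existing_signal: str  # what feedback, analytics, or prior research already says
    method: str           # behavioral | preference | feasibility | demand
    proceed_if: str       # decision criteria defined BEFORE the test
    pivot_if: str
    kill_if: str

# Illustrative example only — the persona, feature, and numbers are made up.
a = Assumption(
    hypothesis="We believe power users will adopt saved filters because they repeat searches",
    existing_signal="analytics show many sessions repeat an identical search",
    method="behavioral",
    proceed_if=">=20% of exposed users save a filter in week 1",
    pivot_if="5-20% save a filter",
    kill_if="<5% save a filter",
)
```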
### Step 5: Priority Issues
For each issue found, assign severity and suggest the Acumen command that would fix it:
- P0 — Blocks shipping. The artifact cannot be acted on until this is resolved.
- P1 — Significant gap. Will cause rework or misalignment if not addressed.
- P2 — Weakness. Makes the artifact less effective but doesn't block.
- P3 — Polish. Would make it better but not urgent.
Available fix commands: /diagnose, /measure, /scout, /persona, /features, /brand, /narrate, /workshop, /increment, /roadmap, /orientation, /defensibility
For UI-touching artifacts: check that referenced design tokens ({colors.x}, button-primary) actually resolve in DESIGN.md. Flag prose styling ("a blue button") as a P2 — it should name a token instead.
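The token-resolution check can be sketched as a scan of the artifact against DESIGN.md. The `{token.path}` brace syntax is assumed from the examples above; a real checker would also handle bare token names like `button-primary`:

```python
import re

# Hypothetical checker: find {token.path} references in an artifact and flag
# any that never appear in DESIGN.md. Token syntax is assumed, not specified.
def unresolved_tokens(artifact_text: str, design_md: str) -> list[str]:
    referenced = set(re.findall(r"\{([\w.-]+)\}", artifact_text))
    return sorted(t for t in referenced if t not in design_md)

artifact = "CTA uses {colors.primary} on a {button-primary} surface"
design = "## Tokens\n- colors.primary: #0055FF\n"
print(unresolved_tokens(artifact, design))  # ['button-primary']
```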
## Output Format
Structure your response as:
### Slop Check
Pass or fail, with the specific tells found and quoted evidence from the artifact.
### Context Grounding Assessment
How well does this artifact reference the product's actual competitive landscape, personas, and existing user feedback?
### Score Table
| Dimension | Score (0-4) | Key Finding |
|---|---|---|
| Problem clarity | | |
| User specificity | | |
| Metric rigor | | |
| Scope discipline | | |
| Strategic coherence | | |
| Edge case coverage | | |
| Communication quality | | |
| Slop score | | |
| Context grounding | | |
| **Total** | /36 | |
### What's Working
Two to three specific strengths. Quote the artifact where it's strong.
### Riskiest Assumptions
For each (up to 3):
- Assumption: [falsifiable hypothesis]
- Existing signal: [what we already know]
- Validation method: [if needed — or "Already validated" / "Low risk"]
- Decision criteria: [go/pivot/kill thresholds]
### Priority Issues
For each issue:
- What: the specific problem
- Why it matters: the consequence if not fixed
- Fix: concrete suggestion
- Command: which Acumen skill would address this
### Verdict
One of three:
- Ship it — minor polish needed, fundamentally sound, assumptions are validated or low-risk
- Tighten it — good bones, but gaps that will cause problems downstream. Validate before building.
- Rethink it — structural issues that revision won't fix; needs a different approach