# Critique Product
## Preparation
Before anything else, invoke /product-thinking to load strategic context, user segments, and current product landscape.
Then read ALL context files:
- `.acumen.md` — product strategy and positioning
- `.acumen/competitors.md` — competitive landscape
- `.acumen/personas.md` — user segments and behaviors
- `.acumen/features.md` — current feature catalog
- `./DESIGN.md` — visual identity contract (only when critiquing UI-touching artifacts; if missing, flag it)
A critique without context is just opinion. You need to know what the product is, who it serves, and what it's competing against before you can evaluate whether an artifact is good.
## Mindset
Read this as if you're the engineer who has to build it, the exec who has to fund it, and the designer who has to make it real. Does it survive all three?
The engineer asks: "Can I scope work from this? Are the edge cases covered? Will I have to come back with 40 clarifying questions?"
The exec asks: "Why should I fund this over the other six things competing for the same resources? What's the expected return?"
The designer asks: "Do I know who this is for? Is the problem clear enough that I can design a solution without guessing?"
If any of them would struggle, the artifact needs work.
## Core Reflex
### Step 1: AI Product Slop Test
Run the slop check first. Scan for these 12 tells — each one weakens the document:
- "Users want..." without evidence (who said this? how many? when?)
- Missing baselines (targets without current numbers are wishes)
- Vague scope ("and more," "etc.," "additional features" — scope creep hiding in plain sight)
- No prioritization rationale (why THIS order? why NOW?)
- Solutions masquerading as problems ("users need a dashboard" is a solution; what's the actual problem?)
- Missing edge cases (what happens when it fails? when data is missing? when the user does something unexpected?)
- Competitor blindness (no mention of alternatives users have today)
- Metric-free success criteria ("improve the experience" — how would you know?)
- One-size-fits-all personas ("our users" instead of specific segments with different needs)
- No tradeoff acknowledgment (everything is upside, nothing is cost)
- Jargon without definition (acronyms and internal terms that assume shared context)
- Missing timeline or sequencing (what ships first? what depends on what?)
Count the tells. More than 4 is a problem. More than 7 means the artifact needs a rewrite, not a revision.
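The counting rule can be sketched as a small helper. Only the thresholds come from this step; the function name and return labels are illustrative:

```python
# Hypothetical helper mapping the tell count to the thresholds above:
# more than 4 tells is a problem; more than 7 means rewrite, not revision.
def slop_verdict(tells_found: list[str]) -> str:
    count = len(tells_found)
    if count > 7:
        return "rewrite"
    if count > 4:
        return "problem"
    return "pass"

print(slop_verdict(["missing baselines", "vague scope", "no tradeoffs"]))  # pass
```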
### Step 2: Score Across 9 Dimensions
Score each dimension 0-4. Be honest. A 4 means genuinely excellent. Most artifacts score 18-28 out of 36.
| # | Dimension | What you're evaluating | 0 | 2 | 4 |
|---|---|---|---|---|---|
| 1 | Problem clarity | Is the problem specific and evidence-based? | Problem not stated or is a disguised solution | Problem stated but vague or missing evidence | Specific problem with user evidence and quantified impact |
| 2 | User specificity | Do you know exactly who this is for? | "Users" with no segmentation | Named segment but generic description | Specific persona with behaviors, context, and needs |
| 3 | Metric rigor | Are there baselines, targets, and measurement plans? | No metrics or unmeasurable goals | Metrics named but missing baselines or targets | Full metric tree with baselines, targets, method, and owner |
| 4 | Scope discipline | Is in/out clear with reasoning? | No scope boundaries | In-scope listed but no out-of-scope or rationale | Clear in/out with reasoning for each boundary decision |
| 5 | Strategic coherence | Does this connect to a larger thesis? | No strategic connection | Mentions strategy but link is hand-wavy | Clear causal chain from strategy to this initiative |
| 6 | Edge case coverage | What breaks? | No edge cases considered | Some edge cases but major gaps | Comprehensive edge cases with mitigation or deferral rationale |
| 7 | Communication quality | Can all three audiences act on this? | Confusing or audience-unclear | One audience served well, others neglected | Engineer, exec, and designer can each extract what they need |
| 8 | Slop score | How many of the 12 AI tells are present? | 7+ tells | 3-6 tells | 0-2 tells |
| 9 | Context grounding | References competitors, personas, feedback? | Operating in a vacuum | Some references but shallow | Deeply grounded in competitive reality, persona data, and user signal |
Reference pm-heuristics for the full scoring rubric and calibration examples.
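The scoring step reduces to a simple aggregation. In this sketch, the 0-4 range and the /36 total come from the rubric above; the dimension keys and function name are assumptions:

```python
# Hypothetical scorer: sum the nine 0-4 dimension scores into the /36 total.
# Dimension keys and function name are illustrative, not part of the rubric.
DIMENSIONS = [
    "problem_clarity", "user_specificity", "metric_rigor",
    "scope_discipline", "strategic_coherence", "edge_case_coverage",
    "communication_quality", "slop_score", "context_grounding",
]

def total_score(scores: dict[str, int]) -> int:
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"unscored dimensions: {missing}")
    bad = [d for d in DIMENSIONS if not 0 <= scores[d] <= 4]
    if bad:
        raise ValueError(f"scores must be 0-4: {bad}")
    return sum(scores[d] for d in DIMENSIONS)

print(total_score({d: 3 for d in DIMENSIONS}))  # 27
```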
### Step 3: Persona-Based Review
Reference reviewer-personas to evaluate the artifact from each reviewer's perspective. Flag where specific audiences would get stuck or push back.
### Step 4: Assumption Validation
After scoring, identify the riskiest assumptions in the artifact — the ones that, if wrong, make everything else irrelevant. For each:
- State the assumption as a falsifiable hypothesis ("We believe [persona] will [behavior] because [reason]")
- Check existing signal — does feedback, analytics, or prior research already validate or invalidate this? Don't propose a test for something you already know.
- Match method to assumption type:
- Behavioral (will they do X?) — observe actions, not words. Analytics, session recordings.
- Preference (do they want X over Y?) — ask with commitment attached. "Would you pay?" has signal.
- Technical feasibility (can we build X?) — prototype or spike, timeboxed.
- Market demand (does anyone care?) — fake door tests, landing pages, waitlists.
- Define decision criteria before the test: "If we see X, we proceed. If Y, we pivot. If Z, we kill it."
If all key assumptions already have strong signal, say so — validation is not always needed. If the artifact is based on assumptions with no evidence, flag that as the primary issue.
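One way to keep each assumption falsifiable is to record it as structured data with the decision criteria attached up front. A minimal sketch; the field names and every example value are hypothetical:

```python
from dataclasses import dataclass

# Hypothetical record for one risky assumption, mirroring the fields above.
@dataclass
class Assumption:
    hypothesis: str       # "We believe [persona] will [behavior] because [reason]"
    existing_signal: str  # what feedback, analytics, or prior research already says
    method: str           # behavioral | preference | feasibility | demand
    proceed_if: str       # decision criteria defined BEFORE the test
    pivot_if: str
    kill_if: str

# Illustrative example only — the persona, feature, and numbers are made up.
a = Assumption(
    hypothesis="We believe power users will adopt saved filters because they repeat searches",
    existing_signal="analytics show many sessions repeat an identical search",
    method="behavioral",
    proceed_if=">=20% of exposed users save a filter in week 1",
    pivot_if="5-20% save a filter",
    kill_if="<5% save a filter",
)
```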
### Step 5: Priority Issues
For each issue found, assign severity and suggest the Acumen command that would fix it:
- P0 — Blocks shipping. The artifact cannot be acted on until this is resolved.
- P1 — Significant gap. Will cause rework or misalignment if not addressed.
- P2 — Weakness. Makes the artifact less effective but doesn't block.
- P3 — Polish. Would make it better but not urgent.
Available fix commands: /diagnose, /measure, /scout, /persona, /features, /brand, /narrate, /workshop, /increment, /roadmap, /orientation, /defensibility
For UI-touching artifacts: check that referenced design tokens ({colors.x}, button-primary) actually resolve in DESIGN.md. Flag prose styling ("a blue button") as a P2 — it should name a token instead.
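The token-resolution check can be sketched as a scan of the artifact against DESIGN.md. The `{token.path}` brace syntax is assumed from the examples above; a real checker would also handle bare token names like `button-primary`:

```python
import re

# Hypothetical checker: find {token.path} references in an artifact and flag
# any that never appear in DESIGN.md. Token syntax is assumed, not specified.
def unresolved_tokens(artifact_text: str, design_md: str) -> list[str]:
    referenced = set(re.findall(r"\{([\w.-]+)\}", artifact_text))
    return sorted(t for t in referenced if t not in design_md)

artifact = "CTA uses {colors.primary} on a {button-primary} surface"
design = "## Tokens\n- colors.primary: #0055FF\n"
print(unresolved_tokens(artifact, design))  # ['button-primary']
```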
## Output Format
Structure your response as:
### Slop Check
Pass or fail, with the specific tells found and quoted evidence from the artifact.
### Context Grounding Assessment
How well does this artifact reference the product's actual competitive landscape, personas, and existing user feedback?
### Score Table
| Dimension | Score (0-4) | Key Finding |
|---|---|---|
| Problem clarity | | |
| User specificity | | |
| Metric rigor | | |
| Scope discipline | | |
| Strategic coherence | | |
| Edge case coverage | | |
| Communication quality | | |
| Slop score | | |
| Context grounding | | |
| **Total** | /36 | |
### What's Working
Two to three specific strengths. Quote the artifact where it's strong.
### Riskiest Assumptions
For each (up to 3):
- Assumption: [falsifiable hypothesis]
- Existing signal: [what we already know]
- Validation method: [if needed — or "Already validated" / "Low risk"]
- Decision criteria: [go/pivot/kill thresholds]
### Priority Issues
For each issue:
- What: the specific problem
- Why it matters: the consequence if not fixed
- Fix: concrete suggestion
- Command: which Acumen skill would address this
### Verdict
One of three:
- Ship it — minor polish needed, fundamentally sound, assumptions are validated or low-risk
- Tighten it — good bones, but gaps that will cause problems downstream. Validate before building.
- Rethink it — structural issues that revision won't fix; needs a different approach