ai-product-canvas
AI Product Canvas Skill
Define AI products with the same rigour as any product decision — but with additional layers for data, model, evaluation, and responsible AI. This canvas prevents the most common AI product failure: building a technically impressive feature that doesn't solve a real problem.
AI Product Anti-Patterns to Check First
Before building, flag if any of these apply:
- ❌ "We should add AI to [existing feature]" — with no user problem defined
- ❌ Accuracy target undefined before build begins
- ❌ No plan for what happens when the model is wrong
- ❌ User-facing AI output with no human review or fallback
- ❌ Training data not audited for bias or quality
- ❌ No evaluation metric — "we'll know it when we see it"
AI Product Canvas Output Format
AI Product Canvas — [Feature Name] — [Date]
PM Owner: [Name]
ML/AI Lead: [Name]
Status: Discovery / Design / Build / Evaluation / Live
1. Problem Definition
User problem being solved:
[What specific situation is the user in? What job are they trying to get done?]
Why AI?
[What makes this problem require AI vs a deterministic solution? If the answer is "because we can," stop here.]
Success for the user looks like:
[What outcome does the user experience when the AI feature is working well?]
2. AI Approach
Task type:
- Classification
- Generation (text, image, code)
- Summarisation / extraction
- Recommendation
- Search / retrieval
- Prediction / forecasting
- Conversation / agent
Model approach:
- LLM API (GPT-4, Claude, Gemini, etc.) — specify: [Model name + version]
- Fine-tuned model on own data
- Custom model trained from scratch
- RAG (retrieval-augmented generation)
- Embedding + vector search
Rationale for chosen approach: [Why this, not alternatives]
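To make the options above concrete, here is a minimal sketch of the embedding + vector search approach. The library (sentence-transformers), model name, and toy corpus are illustrative assumptions, not recommendations from this canvas.

```python
# Minimal embedding + vector search sketch (library and model choices
# are assumptions for illustration, not prescribed by the canvas).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedder

documents = [
    "Refunds are processed within 5 business days.",
    "You can cancel your subscription from the billing page.",
    "Our API rate limit is 100 requests per minute.",
]

# Embed the corpus once; normalise so a dot product equals cosine similarity.
doc_vectors = model.encode(documents, normalize_embeddings=True)

def search(query: str, top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k documents ranked by cosine similarity to the query."""
    query_vector = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ query_vector
    best = np.argsort(scores)[::-1][:top_k]
    return [(float(scores[i]), documents[i]) for i in best]

print(search("how long do refunds take?"))
```

The same retrieval step is the front half of a RAG pipeline: the retrieved passages would be passed to an LLM as context rather than shown directly.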
3. Data Requirements
| Data Type | Source | Volume | Quality Status | Bias Risk |
|---|---|---|---|---|
| [Training data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L |
| [Evaluation data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L |
Data gaps: [What's missing and the plan to get it]
Privacy considerations: [Any PII in training or inference data]
Data ownership: [Do we own this data? Can we use it for training?]
4. Evaluation Framework
Primary metric: [The number that defines success — accuracy, F1, BLEU, user rating, task completion rate]
Minimum acceptable threshold: [Below X, the feature does not ship]
Human evaluation plan: [How will humans review model outputs? Sampling rate? Review panel?]
| Evaluation Type | Method | Cadence | Owner |
|---|---|---|---|
| Offline (pre-launch) | [Test set, benchmark] | Pre-launch | ML Lead |
| Online (post-launch) | [A/B test, user feedback] | Weekly | PM + ML |
| Adversarial | [Red-team, edge cases] | Pre-launch | Safety reviewer |
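A minimal sketch of the offline gate described above, assuming a classification task and scikit-learn metrics; the threshold value and labels are placeholders to be replaced with the numbers agreed in this canvas.

```python
# Offline evaluation gate: the feature does not ship below the minimum
# acceptable threshold. All values here are placeholder assumptions.
from sklearn.metrics import accuracy_score, f1_score

MINIMUM_F1 = 0.85  # assumption: agreed before build begins, recorded above

def evaluate_offline(y_true: list[str], y_pred: list[str]) -> dict:
    """Score model predictions against the held-out evaluation set."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "macro_f1": f1_score(y_true, y_pred, average="macro"),
    }

def ship_decision(metrics: dict) -> bool:
    """Hard gate: below the threshold, the feature does not ship."""
    return metrics["macro_f1"] >= MINIMUM_F1

metrics = evaluate_offline(
    y_true=["refund", "cancel", "refund", "other"],
    y_pred=["refund", "cancel", "other", "other"],
)
print(metrics, "ship" if ship_decision(metrics) else "do not ship")
```

Making the gate an explicit function, rather than a judgment call at launch review, is what turns the threshold into an actual shipping criterion.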
5. User Experience Design
How is AI output presented?
- Direct output shown to user (high trust required)
- AI-assisted with user confirmation
- Suggestion user can accept/reject
- Background action with audit log
Confidence and uncertainty handling:
- What happens when confidence is low? [Show alternative, ask for clarification, fallback to manual]
- How is uncertainty communicated to the user? [UI pattern]
Fallback plan:
- If the model fails or returns an error: [Specific fallback behaviour]
- If accuracy degrades below threshold: [Kill switch or graceful degradation plan]
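A minimal sketch of confidence-based routing covering both low-confidence handling and the model-failure fallback. The thresholds and routing labels are illustrative assumptions; the real values and UX patterns belong in this section of the canvas.

```python
# Confidence-based routing sketch. Thresholds and fallback behaviours are
# illustrative assumptions, not prescribed values.
from dataclasses import dataclass

@dataclass
class Prediction:
    label: str
    confidence: float  # model-reported probability in [0, 1]

HIGH_CONFIDENCE = 0.90  # at or above this, show the output directly
LOW_CONFIDENCE = 0.60   # below this, ask for clarification instead

def route(prediction: Prediction | None) -> str:
    """Decide the UX path for one model output."""
    if prediction is None:  # model error or timeout
        return "fallback:manual"              # the specific fallback behaviour
    if prediction.confidence >= HIGH_CONFIDENCE:
        return f"show:{prediction.label}"     # direct output, high trust required
    if prediction.confidence >= LOW_CONFIDENCE:
        return f"suggest:{prediction.label}"  # user must accept or reject
    return "ask:clarification"                # too uncertain to suggest anything

print(route(Prediction("refund", 0.95)))  # show:refund
print(route(Prediction("refund", 0.70)))  # suggest:refund
print(route(None))                        # fallback:manual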
6. Responsible AI Checklist
- Bias audit completed on training data
- Demographic fairness evaluated (does performance differ by user group? See the sketch after this checklist)
- Hallucination / confabulation risk assessed and mitigated
- User can see and correct AI output
- Opt-out mechanism exists (can user disable the AI feature?)
- Output provenance visible when relevant (does user know AI generated this?)
- PII not used in ways user didn't consent to
- Regulatory review completed (GDPR, AI Act, sector-specific)
- Model cards / documentation completed
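A minimal sketch of the demographic fairness check referenced in the checklist, assuming per-group accuracy is the relevant comparison; the group labels and tolerated gap are placeholder assumptions.

```python
# Per-group performance check. Group labels and the acceptable gap are
# illustrative assumptions; the real groups come from the bias audit.
from collections import defaultdict

def accuracy_by_group(records: list[dict]) -> dict[str, float]:
    """records: [{"group": ..., "correct": bool}, ...] -> accuracy per group."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r["group"]] += 1
        hits[r["group"]] += int(r["correct"])
    return {g: hits[g] / totals[g] for g in totals}

MAX_GAP = 0.05  # assumption: largest tolerated accuracy gap between groups

def fairness_flag(per_group: dict[str, float]) -> bool:
    """True if the spread between best and worst group exceeds the gap."""
    return max(per_group.values()) - min(per_group.values()) > MAX_GAP

per_group = accuracy_by_group([
    {"group": "A", "correct": True}, {"group": "A", "correct": True},
    {"group": "B", "correct": True}, {"group": "B", "correct": False},
])
print(per_group, "investigate" if fairness_flag(per_group) else "within tolerance")
```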
7. Launch & Monitoring Plan
Rollout: [% of users, with staged expansion criteria]
Monitoring metrics:
- Model performance: [Metric + alert threshold]
- User engagement with AI output: [Acceptance rate, override rate, feedback score]
- Error rate: [% of failed inferences]
- Latency: [P95 target]
Model refresh cadence: [How often is the model retrained or updated?]
Drift detection: [How will you know when model performance degrades in production?]
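A minimal drift-detection sketch, assuming the rolling acceptance rate is the monitored signal; the window size and alert floor are illustrative assumptions to be replaced with the canvas's agreed values.

```python
# Drift detection sketch: rolling-window acceptance rate with an alert
# floor. Window size and floor are illustrative assumptions.
from collections import deque

class DriftMonitor:
    """Alert when the rolling acceptance rate drops below the floor."""

    def __init__(self, window: int = 500, floor: float = 0.80):
        self.outcomes: deque[bool] = deque(maxlen=window)
        self.floor = floor

    def record(self, accepted: bool) -> None:
        self.outcomes.append(accepted)

    def check(self) -> str:
        if len(self.outcomes) < self.outcomes.maxlen:
            return "warming-up"
        rate = sum(self.outcomes) / len(self.outcomes)
        if rate < self.floor:
            return f"ALERT: possible drift ({rate:.0%} < {self.floor:.0%})"
        return f"ok ({rate:.0%})"

monitor = DriftMonitor(window=4, floor=0.75)  # tiny window for the example
for accepted in [True, True, False, False]:
    monitor.record(accepted)
print(monitor.check())  # ALERT: possible drift (50% < 75%)
```

The same pattern applies to any of the monitoring metrics listed above (error rate, P95 latency): a rolling window, an agreed floor or ceiling, and an alert when it is crossed.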
Guidelines
- Never skip the "Why AI?" section — it's the most important question in AI product development
- The fallback UX is not optional — what happens when AI fails defines your product's trustworthiness
- Responsible AI checklist must be completed before launch, not after
- Include latency in success metrics — a 5-second AI response is often worse than no AI at all
- Recommend starting with a human-in-the-loop design and automating only when accuracy is proven
Required Inputs
Ask the user for these if not provided:
- Feature or product description (what the AI is intended to do)
- User problem (what problem the AI is solving for users)
- Available data (what training/inference data exists)
- ML/AI lead (who owns the technical implementation)
Quality Checks
- "Why AI?" is answered clearly (not "because we can")
- Minimum acceptable accuracy threshold is defined before build begins
- Fallback UX is specified for model failures or low-confidence outputs
- Responsible AI checklist is completed (not deferred to post-launch)
- Monitoring plan includes both model performance and user engagement metrics