Usability Frameworks
Comprehensive frameworks and methodologies for planning, conducting, and analyzing usability tests to improve user experience.
When to Use This Skill
Auto-loaded by agents:
- research-ops: For usability testing and heuristic evaluation
Use when you need:
- Planning usability tests
- Conducting user testing sessions
- Evaluating interface designs
- Identifying usability problems
- Testing prototypes or live products
- Applying Nielsen's heuristics
- Measuring usability metrics
Core Concepts
What is Usability Testing?
Usability testing is a method for evaluating a product by testing it with representative users. Users attempt to complete typical tasks while observers watch, listen, and take notes.
Purpose: Identify usability problems, discover opportunities for improvement, and learn about user behavior and preferences.
When to use:
- Before development (testing prototypes)
- During development (iterative testing)
- After launch (validation and optimization)
- Before major redesigns
The Five Usability Quality Components (Jakob Nielsen)
- Learnability: How easy is it for users to accomplish basic tasks the first time?
- Efficiency: How quickly can users perform tasks once they've learned the design?
- Memorability: Can users remember how to use it after time away?
- Errors: How many errors do users make, how severe, and how easily can they recover?
- Satisfaction: How pleasant is it to use the design?
Usability Testing Methodologies
1. Moderated Testing
Setup: Researcher guides participants through tasks in real-time
Location: In-person or remote (video call)
Best for:
- Early-stage prototypes needing clarification
- Complex products requiring guidance
- Exploring "why" behind user behavior
- Uncovering emotional reactions
Process:
- Welcome and set expectations
- Pre-task questions (background, experience)
- Task scenarios with think-aloud protocol
- Post-task questions and discussion
- Wrap-up and thank you
Advantages:
- Rich qualitative insights
- Can probe deeper into issues
- Observe non-verbal cues
- Clarify misunderstandings immediately
Limitations:
- More time-intensive (30-60 min per session)
- Researcher bias possible
- Smaller sample sizes
- Scheduling logistics
2. Unmoderated Testing
Setup: Participants complete tasks independently, recorded for later review
Location: Remote, on participant's own schedule
Best for:
- Mature products with clear tasks
- Large sample sizes needed
- Quick turnaround required
- Benchmarking and metrics
Process:
- Automated instructions and consent
- Participants record screen/audio while completing tasks
- Automated post-task surveys
- Researcher reviews recordings later
Advantages:
- Faster data collection
- Larger sample sizes
- More natural environment
- Lower cost per participant
Limitations:
- Can't probe or clarify
- May miss nuanced insights
- Technical issues harder to resolve
- Participants may skip think-aloud
3. Hybrid Approaches
Combination methods:
- Moderated first impressions + unmoderated task completion
- Unmoderated testing + follow-up interviews with interesting cases
- Moderated pilot + unmoderated scale testing
Nielsen's 10 Usability Heuristics
Quick reference for evaluating interfaces. See references/nielsens-10-heuristics.md for detailed explanations and examples.
- Visibility of system status - Keep users informed
- Match between system and real world - Speak users' language
- User control and freedom - Provide escape hatches
- Consistency and standards - Follow platform conventions
- Error prevention - Prevent problems before they occur
- Recognition rather than recall - Minimize memory load
- Flexibility and efficiency of use - Accelerators for experts
- Aesthetic and minimalist design - Remove irrelevant information
- Help users recognize, diagnose, and recover from errors - Plain language error messages
- Help and documentation - Provide when needed
Think-Aloud Protocol
What It Is
Participants verbalize their thoughts while completing tasks, providing real-time insight into their mental model.
Types
Concurrent think-aloud: Speak while performing tasks
- More natural thought flow
- May affect task performance slightly
Retrospective think-aloud: Review recording and explain thinking after
- Doesn't disrupt natural behavior
- May forget or rationalize thoughts
Facilitating Think-Aloud
Prompts to use:
- "What are you thinking right now?"
- "What are you looking for?"
- "What would you expect to happen?"
- "Is this what you expected?"
Don't:
- Ask leading questions
- Provide hints or solutions
- Interrupt natural flow too often
- Make participants feel tested
See references/think-aloud-protocol-guide.md for detailed facilitation techniques.
Task Scenario Design
Good task scenarios are critical to meaningful usability test results.
Characteristics of Good Task Scenarios
- Realistic: Based on actual user goals
- Specific: Clear endpoint/success criteria
- Self-contained: Provide all necessary context
- Actionable: Clear starting point
- Not prescriptive: Don't tell them how to do it
Example Transformation
Poor: "Click on the 'My Account' link and change your password"
- Too prescriptive, tells them exactly where to click
Good: "You've heard about recent security breaches and want to make your account more secure. Update your account to use a stronger password."
- Realistic motivation, clear goal, doesn't prescribe path
Task Complexity Levels
Simple tasks (1-2 steps): Establish baseline usability
Medium tasks (3-5 steps): Test core workflows
Complex tasks (6+ steps): Evaluate overall experience and error recovery
See assets/task-scenario-template.md for ready-to-use templates.
Severity Rating Framework
Not all usability issues are equal. Prioritize fixes based on severity.
Three-Factor Severity Rating
Frequency: How often does this issue occur?
- High: > 50% of users encounter
- Medium: 10-50% encounter
- Low: < 10% encounter
Impact: When it occurs, how badly does it affect users?
- High: Prevents task completion / causes data loss
- Medium: Causes frustration or delays
- Low: Minor annoyance
Persistence: Do users overcome it with experience?
- High: Problem doesn't go away
- Medium: Users learn to avoid/work around
- Low: One-time problem only
Combined Severity Ratings
Critical (P0): High frequency + High impact
Serious (P1): High frequency + Medium impact, OR Medium frequency + High impact
Moderate (P2): High frequency + Low impact, OR Medium frequency + Medium impact, OR Low frequency + High impact
Minor (P3): Everything else
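The frequency-by-impact mapping above can be sketched as a small lookup function. The level names ("high"/"medium"/"low") and the function name are illustrative; persistence can serve as a tie-breaker within a bucket.

```python
def severity(frequency: str, impact: str) -> str:
    """Map frequency x impact to a priority bucket (P0-P3)."""
    rank = {"high": 2, "medium": 1, "low": 0}
    f, i = rank[frequency], rank[impact]
    if f == 2 and i == 2:
        return "P0 (Critical)"
    if (f == 2 and i == 1) or (f == 1 and i == 2):
        return "P1 (Serious)"
    if (f == 2 and i == 0) or (f == 1 and i == 1) or (f == 0 and i == 2):
        return "P2 (Moderate)"
    return "P3 (Minor)"

# A rarely-seen but task-blocking issue still lands in P2:
print(severity("low", "high"))  # P2 (Moderate)
```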
See assets/severity-rating-guide.md for detailed rating criteria and examples.
Usability Metrics
Quantitative Metrics
Task Success Rate: % of participants who complete task successfully
- Binary: Did they complete it? (yes/no)
- Partial credit: Did they complete most of it?
Time on Task: How long to complete (for successful completions)
- Compare to baseline or competitor benchmarks
Error Rate: Number of errors per task
- Define what counts as an error for each task
Clicks/Taps to Task Completion: Efficiency measure
- More relevant for well-defined tasks
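A minimal sketch of how the metrics above fall out of raw session data. The session records here are hypothetical; note that time on task is averaged over successful completions only, as described above.

```python
from statistics import mean

# Hypothetical session records: (completed, seconds, error_count).
sessions = [
    (True, 95, 0),
    (True, 140, 2),
    (False, 210, 4),
    (True, 88, 1),
    (False, 180, 3),
]

success_rate = sum(done for done, _, _ in sessions) / len(sessions)
# Time on task uses successful completions only.
time_on_task = mean(t for done, t, _ in sessions if done)
error_rate = mean(e for _, _, e in sessions)

print(f"Success rate: {success_rate:.0%}")        # 60%
print(f"Mean time on task: {time_on_task:.0f}s")  # 108s
print(f"Errors per task: {error_rate:.1f}")       # 2.0
```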
Standardized Questionnaires
SUS (System Usability Scale):
- 10 questions, 5-point Likert scale
- Score 0-100 (industry avg ~68)
- Quick, reliable, easy to administer
- Good for comparing versions or benchmarking
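SUS scoring follows a fixed rule: positive (odd-numbered) items contribute response minus 1, negative (even-numbered) items contribute 5 minus response, and the sum is scaled by 2.5 to reach 0-100. A sketch:

```python
def sus_score(responses):
    """Score one SUS questionnaire: ten answers on a 1-5 scale.

    Odd items (positively worded) contribute response - 1;
    even items (negatively worded) contribute 5 - response;
    the total is multiplied by 2.5 to give a 0-100 score.
    """
    assert len(responses) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r
                for i, r in enumerate(responses))
    return total * 2.5

# Strong agreement with positives, strong disagreement with negatives:
print(sus_score([5, 1] * 5))  # 100.0
```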
UMUX (Usability Metric for User Experience):
- 4 questions, lighter than SUS
- Similar reliability
- Faster for participants
SEQ (Single Ease Question):
- "Overall, how difficult or easy was the task to complete?" (1-7)
- One question per task
- Immediate subjective difficulty rating
Other scales:
- SUPR-Q (for websites)
- PSSUQ (post-study)
- NASA-TLX (cognitive load)
Qualitative Insights
Observed behaviors:
- Hesitations and confusion
- Error patterns
- Unexpected paths
- Verbal frustrations
Verbalized thoughts (think-aloud):
- Mental model mismatches
- Expectation violations
- Pleasantly surprising discoveries
Sample Size Guidelines
For Qualitative Insights
Nielsen's recommendation: 5 users finds ~85% of usability problems
- Diminishing returns after 5
- Run 3+ small rounds instead of 1 large round
- Iterate between rounds
Reality check:
- 5 is a minimum, not ideal
- Complex products may need 8-10
- Multiple user types need 5 each
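The "5 users find ~85%" figure comes from Nielsen's discovery model: if each user detects a given problem with probability L (about 0.31 in his data), n users uncover a proportion 1 - (1 - L)^n of the problems. A quick sketch of the curve:

```python
# Nielsen's problem-discovery model with per-user detection rate L.
L = 0.31
for n in (1, 3, 5, 10):
    found = 1 - (1 - L) ** n
    print(f"{n:>2} users -> {found:.0%} of problems")
```

The diminishing returns are visible directly: going from 5 to 10 users adds far less than going from 1 to 5, which is why several small iterative rounds beat one large round.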
For Quantitative Metrics
Benchmarking: 20+ users per user group
A/B testing: Depends on effect size and desired confidence
Statistical significance: Use power analysis calculators
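When benchmarking task success with samples this small, a plain proportion is misleading without an interval. One common approach for small usability samples is the adjusted-Wald (Agresti-Coull) interval, sketched below; the example numbers are hypothetical.

```python
import math

def adjusted_wald_ci(successes: int, n: int, z: float = 1.96):
    """Adjusted-Wald (Agresti-Coull) confidence interval for a
    task success rate; behaves well for small samples."""
    p = (successes + z * z / 2) / (n + z * z)
    half = z * math.sqrt(p * (1 - p) / (n + z * z))
    return max(0.0, p - half), min(1.0, p + half)

low, high = adjusted_wald_ci(16, 20)
print(f"16/20 successes: {low:.0%}-{high:.0%} at ~95% confidence")
```

Reporting "80% success (95% CI roughly 58-93%)" makes clear how wide the uncertainty still is at n=20, which motivates the 20+ users per group guideline.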
Planning Your Usability Test
1. Define Objectives
What decisions will this research inform?
- Redesign priorities?
- Feature cut decisions?
- Success of recent changes?
2. Identify User Segments
Who needs to be tested?
- New vs. experienced users?
- Different roles or use cases?
- Different devices or contexts?
3. Select Tasks
What tasks represent success?
- Most critical user goals
- Most frequent tasks
- Recently changed features
- Known problem areas
4. Choose Methodology
Moderated, unmoderated, or hybrid?
- Consider timeline, budget, research questions
5. Create Test Script
See assets/usability-test-script-template.md for a ready-to-use structure including:
- Welcome and consent
- Background questions
- Task instructions
- Probing questions
- Wrap-up
6. Recruit Participants
- Define screening criteria
- Aim for 5-10 per user segment
- Plan for no-shows (recruit 20% extra)
- Offer appropriate incentives
7. Conduct Pilot Test
- Test with colleague or friend
- Validate timing
- Check recording setup
- Refine unclear tasks
8. Run Sessions
- Stay neutral and encouraging
- Observe without interfering
- Take detailed notes
- Record if permitted
9. Analyze and Synthesize
- Code issues by severity
- Identify patterns across participants
- Link issues to heuristics violated
- Quantify task success and time
10. Report and Recommend
- Prioritized issue list
- Video clips of critical issues
- Recommendations with rationale
- Quick wins vs. strategic fixes
Integration with Product Development
When to Test
Discovery phase: Test competitors or analogous products
Concept phase: Test paper prototypes or wireframes
Design phase: Test high-fidelity mockups
Development phase: Test working builds iteratively
Pre-launch: Validate before release
Post-launch: Identify optimization opportunities
Continuous Usability Testing
Build it into your process:
- Weekly or bi-weekly test sessions
- Rotating focus (new features, established flows, mobile vs. desktop)
- Standing recruiting panel
- Lightweight reporting to team
Ready-to-Use Templates
We provide templates to accelerate your usability testing:
In assets/:
- usability-test-script-template.md: Complete moderator script structure
- task-scenario-template.md: Framework for creating effective task scenarios
- severity-rating-guide.md: Detailed criteria for rating usability issues
In references/:
- nielsens-10-heuristics.md: Deep dive into each heuristic with examples
- think-aloud-protocol-guide.md: Advanced facilitation techniques and troubleshooting
Common Pitfalls to Avoid
- Leading participants: "Was that easy?" → "How would you describe that experience?"
- Testing the wrong tasks: Tasks that aren't real user goals
- Over-explaining: Let users struggle and discover issues naturally
- Ignoring severity: Fixing cosmetic issues while critical issues remain
- Testing too late: After it's expensive to change
- Not iterating: One-and-done testing instead of continuous improvement
- Confusing usability with preference: "I like green" ≠ usability issue
- Sample bias: Testing only power users or only complete novices
Further Learning
Books:
- "Rocket Surgery Made Easy" by Steve Krug
- "Handbook of Usability Testing" by Jeffrey Rubin
- "Moderating Usability Tests" by Joseph Dumas
Online resources:
- Nielsen Norman Group articles
- Usability.gov
- Baymard Institute research
Troubleshooting
"Participants aren't finding real issues": Your tasks are too easy or too guided. Write tasks as goals ("Find a flight under $300 to Chicago"), not instructions ("Click the search button, then enter Chicago"). Let participants struggle -- that's where the insights are.
"Stakeholders dismiss usability findings": Quantify severity. Use Nielsen's severity scale (0-4) and pair each finding with task success rate data. "3 of 5 users failed to complete checkout" is harder to dismiss than "checkout is confusing."
"We only have 3 users available for testing": That's enough for a guerrilla test. Nielsen's research shows 5 users find ~85% of issues, but 3 users still find the majority. Run the test, fix the top issues, test again with 3 more.
Related Skills
- interview-frameworks: User interview techniques for deeper qualitative research
- synthesis-frameworks: Synthesizing usability findings into actionable insights
- user-research-techniques: Broader research methods beyond usability testing
- validation-frameworks: Validating solutions after usability improvements