Usability Frameworks

Comprehensive frameworks and methodologies for planning, conducting, and analyzing usability tests to improve user experience.

When to Use This Skill

Auto-loaded by agents:

  • research-ops - For usability testing and heuristic evaluation

Use when you need:

  • Planning usability tests
  • Conducting user testing sessions
  • Evaluating interface designs
  • Identifying usability problems
  • Testing prototypes or live products
  • Applying Nielsen's heuristics
  • Measuring usability metrics

Core Concepts

What is Usability Testing?

Usability testing is a method for evaluating a product by testing it with representative users. Users attempt to complete typical tasks while observers watch, listen, and take notes.

Purpose: Identify usability problems, discover opportunities for improvement, and learn about user behavior and preferences.

When to use:

  • Before development (testing prototypes)
  • During development (iterative testing)
  • After launch (validation and optimization)
  • Before major redesigns

The Five Usability Quality Components (Jakob Nielsen)

  1. Learnability: How easy is it for users to accomplish basic tasks the first time?
  2. Efficiency: How quickly can users perform tasks once they've learned the design?
  3. Memorability: Can users remember how to use it after time away?
  4. Errors: How many errors do users make, how severe, and how easily can they recover?
  5. Satisfaction: How pleasant is it to use the design?

Usability Testing Methodologies

1. Moderated Testing

Setup: Researcher guides participants through tasks in real time
Location: In-person or remote (video call)

Best for:

  • Early-stage prototypes needing clarification
  • Complex products requiring guidance
  • Exploring "why" behind user behavior
  • Uncovering emotional reactions

Process:

  1. Welcome and set expectations
  2. Pre-task questions (background, experience)
  3. Task scenarios with think-aloud protocol
  4. Post-task questions and discussion
  5. Wrap-up and thank you

Advantages:

  • Rich qualitative insights
  • Can probe deeper into issues
  • Observe non-verbal cues
  • Clarify misunderstandings immediately

Limitations:

  • More time-intensive (30-60 min per session)
  • Researcher bias possible
  • Smaller sample sizes
  • Scheduling logistics

2. Unmoderated Testing

Setup: Participants complete tasks independently; sessions are recorded for later review
Location: Remote, on the participant's own schedule

Best for:

  • Mature products with clear tasks
  • Large sample sizes needed
  • Quick turnaround required
  • Benchmarking and metrics

Process:

  1. Automated instructions and consent
  2. Participants record screen/audio while completing tasks
  3. Automated post-task surveys
  4. Researcher reviews recordings later

Advantages:

  • Faster data collection
  • Larger sample sizes
  • More natural environment
  • Lower cost per participant

Limitations:

  • Can't probe or clarify
  • May miss nuanced insights
  • Technical issues harder to resolve
  • Participants may skip think-aloud

3. Hybrid Approaches

Combination methods:

  • Moderated first impressions + unmoderated task completion
  • Unmoderated testing + follow-up interviews with interesting cases
  • Moderated pilot + unmoderated scale testing

Nielsen's 10 Usability Heuristics

Quick reference for evaluating interfaces. See references/nielsens-10-heuristics.md for detailed explanations and examples.

  1. Visibility of system status - Keep users informed
  2. Match between system and real world - Speak users' language
  3. User control and freedom - Provide escape hatches
  4. Consistency and standards - Follow platform conventions
  5. Error prevention - Prevent problems before they occur
  6. Recognition rather than recall - Minimize memory load
  7. Flexibility and efficiency of use - Accelerators for experts
  8. Aesthetic and minimalist design - Remove irrelevant information
  9. Help users recognize, diagnose, and recover from errors - Plain language error messages
  10. Help and documentation - Provide when needed

Think-Aloud Protocol

What It Is

Participants verbalize their thoughts while completing tasks, providing real-time insight into their mental model.

Types

Concurrent think-aloud: Speak while performing tasks

  • More natural thought flow
  • May affect task performance slightly

Retrospective think-aloud: Review recording and explain thinking after

  • Doesn't disrupt natural behavior
  • May forget or rationalize thoughts

Facilitating Think-Aloud

Prompts to use:

  • "What are you thinking right now?"
  • "What are you looking for?"
  • "What would you expect to happen?"
  • "Is this what you expected?"

Don't:

  • Ask leading questions
  • Provide hints or solutions
  • Interrupt natural flow too often
  • Make participants feel tested

See references/think-aloud-protocol-guide.md for detailed facilitation techniques.

Task Scenario Design

Good task scenarios are critical to meaningful usability test results.

Characteristics of Good Task Scenarios

  • Realistic: Based on actual user goals
  • Specific: Clear endpoint/success criteria
  • Self-contained: Provide all necessary context
  • Actionable: Clear starting point
  • Not prescriptive: Don't tell them how to do it

Example Transformation

Poor: "Click on the 'My Account' link and change your password"

  • Too prescriptive, tells them exactly where to click

Good: "You've heard about recent security breaches and want to make your account more secure. Update your account to use a stronger password."

  • Realistic motivation, clear goal, doesn't prescribe path

Task Complexity Levels

  • Simple tasks (1-2 steps): Establish baseline usability
  • Medium tasks (3-5 steps): Test core workflows
  • Complex tasks (6+ steps): Evaluate overall experience and error recovery

See assets/task-scenario-template.md for ready-to-use templates.

Severity Rating Framework

Not all usability issues are equal. Prioritize fixes based on severity.

Three-Factor Severity Rating

Frequency: How often does this issue occur?

  • High: > 50% of users encounter
  • Medium: 10-50% encounter
  • Low: < 10% encounter

Impact: When it occurs, how badly does it affect users?

  • High: Prevents task completion / causes data loss
  • Medium: Causes frustration or delays
  • Low: Minor annoyance

Persistence: Do users overcome it with experience?

  • High: Problem doesn't go away
  • Medium: Users learn to avoid/work around
  • Low: One-time problem only

Combined Severity Ratings

  • Critical (P0): High frequency + High impact
  • Serious (P1): High frequency + Medium impact, OR Medium frequency + High impact
  • Moderate (P2): High frequency + Low impact, OR Medium frequency + Medium impact, OR Low frequency + High impact
  • Minor (P3): Everything else
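Assuming frequency and impact are each rated high/medium/low as described above, the combined mapping can be expressed as a small helper (a sketch only; the function name and string levels are illustrative, and persistence could be layered on as a tiebreaker):

```python
def severity_priority(frequency: str, impact: str) -> str:
    """Map three-level frequency and impact ratings to a priority bucket
    using the Critical/Serious/Moderate/Minor scheme described above."""
    level = {"high": 2, "medium": 1, "low": 0}
    f, i = level[frequency], level[impact]
    if f == 2 and i == 2:
        return "P0"  # Critical: high frequency + high impact
    if f + i == 3:
        return "P1"  # Serious: high+medium or medium+high
    if f + i == 2:
        return "P2"  # Moderate: high+low, medium+medium, or low+high
    return "P3"  # Minor: everything else
```

Encoding the levels as integers makes the OR cases collapse to simple sums, which keeps the rules auditable against the table above.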

See assets/severity-rating-guide.md for detailed rating criteria and examples.

Usability Metrics

Quantitative Metrics

Task Success Rate: % of participants who complete task successfully

  • Binary: Did they complete it? (yes/no)
  • Partial credit: Did they complete most of it?
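Binary and partial-credit success can be computed the same way if each outcome is coded on a 0-1 scale (a sketch; this coding convention is an assumption, not a standard):

```python
def task_success_rate(outcomes, partial_credit=False):
    """Success rate for one task across participants.

    outcomes: list of floats in [0, 1], where 1.0 is full completion,
    0.0 is failure, and intermediate values are partial completion.
    With partial_credit=False, only full completions count as success.
    """
    if not outcomes:
        raise ValueError("no outcomes recorded")
    if partial_credit:
        return sum(outcomes) / len(outcomes)
    return sum(1 for o in outcomes if o == 1.0) / len(outcomes)
```

Reporting both numbers side by side makes it obvious when many participants get close but few finish.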

Time on Task: How long to complete (for successful completions)

  • Compare to baseline or competitor benchmarks
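Because time-on-task data is usually right-skewed (a few participants take far longer than the rest), the geometric mean is often a better small-sample summary than the arithmetic mean. A minimal sketch, assuming all recorded times are positive:

```python
import math

def geometric_mean_time(times):
    """Geometric mean of task completion times (e.g. seconds).
    Less sensitive to the long right tail typical of timing data
    than the arithmetic mean; requires strictly positive times."""
    return math.exp(sum(math.log(t) for t in times) / len(times))
```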

Error Rate: Number of errors per task

  • Define what counts as an error for each task

Clicks/Taps to Task Completion: Efficiency measure

  • More relevant for well-defined tasks

Standardized Questionnaires

SUS (System Usability Scale):

  • 10 questions, 5-point Likert scale
  • Score 0-100 (industry avg ~68)
  • Quick, reliable, easy to administer
  • Good for comparing versions or benchmarking
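SUS uses a fixed scoring rule: odd-numbered (positively worded) items contribute response − 1, even-numbered (negatively worded) items contribute 5 − response, and the sum is multiplied by 2.5 to land on the 0-100 scale. A minimal sketch:

```python
def sus_score(responses):
    """System Usability Scale score from ten Likert responses (1-5),
    in questionnaire order. Returns a value on the 0-100 scale."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS requires ten responses on a 1-5 scale")
    total = sum((r - 1) if i % 2 == 0 else (5 - r)  # i is 0-based,
                for i, r in enumerate(responses))   # so even i = odd item
    return total * 2.5
```

Note that a SUS score is not a percentage: 68 is roughly average, not a "D grade".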

UMUX (Usability Metric for User Experience):

  • 4 questions, lighter than SUS
  • Similar reliability
  • Faster for participants

SEQ (Single Ease Question):

  • "Overall, how difficult or easy was the task to complete?" (1-7)
  • One question per task
  • Immediate subjective difficulty rating

Other scales:

  • SUPR-Q (for websites)
  • PSSUQ (post-study)
  • NASA-TLX (cognitive load)

Qualitative Insights

Observed behaviors:

  • Hesitations and confusion
  • Error patterns
  • Unexpected paths
  • Verbal frustrations

Verbalized thoughts (think-aloud):

  • Mental model mismatches
  • Expectation violations
  • Pleasantly surprising discoveries

Sample Size Guidelines

For Qualitative Insights

Nielsen's recommendation: 5 users find ~85% of usability problems

  • Diminishing returns after 5
  • Run 3+ small rounds instead of 1 large round
  • Iterate between rounds
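The ~85% figure comes from modeling each user as independently uncovering a given problem with probability p; Nielsen and Landauer estimated p ≈ 0.31 averaged across projects. A sketch of the expected discovery curve under that model:

```python
def problems_found(n_users, p=0.31):
    """Expected fraction of usability problems uncovered by n users,
    assuming each user independently finds a given problem with
    probability p (Nielsen and Landauer's average estimate)."""
    return 1 - (1 - p) ** n_users
```

With p = 0.31, five users are expected to uncover about 84% of problems and ten users about 98%, which is why additional rounds beat larger single rounds. For hard-to-trigger problems p is lower and the curve flattens, hence the reality check below.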

Reality check:

  • 5 is a minimum, not ideal
  • Complex products may need 8-10
  • Multiple user types need 5 each

For Quantitative Metrics

  • Benchmarking: 20+ users per user group
  • A/B testing: Depends on effect size and desired confidence
  • Statistical significance: Use power analysis calculators
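Even at benchmarking sample sizes, completion rates should be reported with confidence intervals; the Wilson score interval behaves well at small n. A sketch (1.96 corresponds to ~95% confidence):

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion,
    e.g. a task completion rate. Returns (low, high)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - margin, center + margin
```

For 4 successes out of 5 this gives roughly (0.38, 0.96), a useful reminder that n = 5 supports qualitative conclusions, not precise rates.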

Planning Your Usability Test

1. Define Objectives

What decisions will this research inform?

  • Redesign priorities?
  • Feature cut decisions?
  • Success of recent changes?

2. Identify User Segments

Who needs to be tested?

  • New vs. experienced users?
  • Different roles or use cases?
  • Different devices or contexts?

3. Select Tasks

What tasks represent success?

  • Most critical user goals
  • Most frequent tasks
  • Recently changed features
  • Known problem areas

4. Choose Methodology

Moderated, unmoderated, or hybrid?

  • Consider timeline, budget, research questions

5. Create Test Script

See assets/usability-test-script-template.md for a ready-to-use structure including:

  • Welcome and consent
  • Background questions
  • Task instructions
  • Probing questions
  • Wrap-up

6. Recruit Participants

  • Define screening criteria
  • Aim for 5-10 per user segment
  • Plan for no-shows (recruit 20% extra)
  • Offer appropriate incentives

7. Conduct Pilot Test

  • Test with colleague or friend
  • Validate timing
  • Check recording setup
  • Refine unclear tasks

8. Run Sessions

  • Stay neutral and encouraging
  • Observe without interfering
  • Take detailed notes
  • Record if permitted

9. Analyze and Synthesize

  • Code issues by severity
  • Identify patterns across participants
  • Link issues to heuristics violated
  • Quantify task success and time

10. Report and Recommend

  • Prioritized issue list
  • Video clips of critical issues
  • Recommendations with rationale
  • Quick wins vs. strategic fixes

Integration with Product Development

When to Test

  • Discovery phase: Test competitors or analogous products
  • Concept phase: Test paper prototypes or wireframes
  • Design phase: Test high-fidelity mockups
  • Development phase: Test working builds iteratively
  • Pre-launch: Validate before release
  • Post-launch: Identify optimization opportunities

Continuous Usability Testing

Build it into your process:

  • Weekly or bi-weekly test sessions
  • Rotating focus (new features, established flows, mobile vs. desktop)
  • Standing recruiting panel
  • Lightweight reporting to team

Ready-to-Use Templates

We provide templates to accelerate your usability testing:

In assets/:

  • usability-test-script-template.md: Complete moderator script structure
  • task-scenario-template.md: Framework for creating effective task scenarios
  • severity-rating-guide.md: Detailed criteria for rating usability issues

In references/:

  • nielsens-10-heuristics.md: Deep dive into each heuristic with examples
  • think-aloud-protocol-guide.md: Advanced facilitation techniques and troubleshooting

Common Pitfalls to Avoid

  1. Leading participants: "Was that easy?" → "How would you describe that experience?"
  2. Testing the wrong tasks: Tasks that aren't real user goals
  3. Over-explaining: Let users struggle and discover issues naturally
  4. Ignoring severity: Fixing cosmetic issues while critical issues remain
  5. Testing too late: After it's expensive to change
  6. Not iterating: One-and-done testing instead of continuous improvement
  7. Confusing usability with preference: "I like green" ≠ usability issue
  8. Sample bias: Testing only power users or only complete novices

Further Learning

Books:

  • "Rocket Surgery Made Easy" by Steve Krug
  • "Handbook of Usability Testing" by Jeffrey Rubin
  • "Moderating Usability Tests" by Joseph Dumas

Online resources:

  • Nielsen Norman Group articles
  • Usability.gov
  • Baymard Institute research

This skill provides the foundation for conducting effective usability testing. Use the templates in assets/ for quick starts and references/ for deeper dives into specific techniques.
