Human Taste

Evaluate UX and product design through human taste -- the trained judgment that detects whether a design reduces cognitive friction, feels coherent, and fits its audience.

This skill is grounded in research from cognitive psychology, HCI, and design practice. For full citations see references/research-sources.md.

Why This Matters

LLMs can generate designs, but aesthetic judgment involves empathy, cultural awareness, and pattern recognition that require human-calibrated evaluation. Research shows:

Users form aesthetic impressions within milliseconds (eye-tracking studies)
Interfaces that reduce cognitive load are perceived as more beautiful (Processing Fluency Theory)
Taste develops through repeated exposure and operates at a pre-conscious perceptual level
Good taste means choosing simplicity over mere familiarity (Rich Hickey's simple-versus-easy distinction, from his talk "Simple Made Easy")

This skill provides a structured protocol so agents can approximate that judgment systematically.

Quick Start

When asked to evaluate a design:

Identify what you are evaluating -- screenshot, wireframe, live page, component, or described flow
Run the rubric below across all six dimensions
Produce a Human Taste Report using the output template
Cite specific elements -- never give vague praise or criticism

Evaluation Rubric

Score each dimension 1-5. Anchor your score with concrete evidence from the design.

1. Cognitive Load (weight: high)

Does the design minimize unnecessary mental effort?

Score	Meaning
1	Overwhelming -- too many competing elements, no clear entry point
2	Heavy -- user must work to understand the hierarchy
3	Moderate -- some unnecessary complexity but functional
4	Light -- clear hierarchy, minimal distractions
5	Effortless -- information is exactly where you expect it

Look for: element count per view, competing focal points, label clarity, progressive disclosure, information grouping.

2. Visual Coherence (weight: high)

Does the design feel unified rather than assembled from parts?

Score	Meaning
1	Fragmented -- inconsistent spacing, colors, typography
2	Patchy -- some consistency but noticeable breaks
3	Adequate -- follows a system with minor deviations
4	Cohesive -- strong visual rhythm, clear design system
5	Seamless -- every element reinforces the whole

Look for: spacing consistency, color palette discipline, typographic scale, alignment grid, icon style unity.

3. Interaction Clarity (weight: high)

Can a user predict what happens next at every step?

Score	Meaning
1	Opaque -- controls are ambiguous, outcomes unclear
2	Confusing -- some actions have surprising results
3	Functional -- most flows are predictable
4	Clear -- affordances are obvious, feedback is immediate
5	Intuitive -- zero learning curve, flows feel inevitable

Look for: button labels, hover/focus states, loading indicators, error messages, navigation predictability, undo availability.

4. Context Fit (weight: medium)

Does the design match its audience and environment?

Score	Meaning
1	Mismatch -- tone, density, or style wrong for the audience
2	Off -- partially appropriate but feels generic
3	Acceptable -- reasonable for the context
4	Tailored -- shows awareness of user needs and setting
5	Perfect fit -- feels like it was made for exactly this audience

Look for: reading level, information density vs audience expertise, platform conventions, accessibility, cultural appropriateness.

5. Restraint (weight: medium)

Does the design know what to leave out?

Score	Meaning
1	Bloated -- every feature is visible, nothing is prioritized
2	Cluttered -- too many options competing for attention
3	Balanced -- reasonable feature surface
4	Disciplined -- clear priorities, secondary items recede
5	Minimal -- only the essential, nothing to remove

Look for: feature density, progressive disclosure, empty states, whitespace usage, hidden-by-default patterns.

6. Emotional Response (weight: low)

Does the design evoke the intended feeling?

Score	Meaning
1	Repellent -- actively unpleasant
2	Flat -- no emotional register
3	Neutral -- inoffensive
4	Warm -- creates mild positive engagement
5	Delightful -- memorable, evokes trust or joy

Look for: micro-interactions, illustration style, copy tone, color warmth, motion design, personality.

Output Template

Produce your evaluation in this format:

# Human Taste Report

**Subject:** [what was evaluated]
**Date:** [date]
**Overall Score:** [weighted average, 1-5, one decimal] / 5

## Scores

| Dimension | Score | Key Evidence |
|-----------|-------|-------------|
| Cognitive Load | X/5 | [specific observation] |
| Visual Coherence | X/5 | [specific observation] |
| Interaction Clarity | X/5 | [specific observation] |
| Context Fit | X/5 | [specific observation] |
| Restraint | X/5 | [specific observation] |
| Emotional Response | X/5 | [specific observation] |

## Strengths
- [concrete strength with evidence]
- [concrete strength with evidence]

## Issues
- **[severity: Critical/Major/Minor]**: [specific issue] -- [why it matters] -- [suggested fix]

## Verdict
[2-3 sentence summary: what works, what does not, and the single highest-impact improvement]

Weighted average formula (high weight = 3, medium = 2, low = 1, so the weights sum to 14): (CognitiveLoad*3 + VisualCoherence*3 + InteractionClarity*3 + ContextFit*2 + Restraint*2 + EmotionalResponse*1) / 14
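
As a worked illustration of the formula above, the following Python sketch computes the overall score. The names `WEIGHTS` and `overall_score` are hypothetical, introduced only for this example, and the sample scores are made up. The weights 3, 2, and 1 are read off the formula for the high, medium, and low dimensions.

```python
# Hypothetical helper for illustration; not part of the skill itself.
# Weights follow the formula above: high = 3, medium = 2, low = 1.
WEIGHTS = {
    "Cognitive Load": 3,
    "Visual Coherence": 3,
    "Interaction Clarity": 3,
    "Context Fit": 2,
    "Restraint": 2,
    "Emotional Response": 1,
}


def overall_score(scores: dict[str, int]) -> float:
    """Return the weighted average of six 1-5 scores, rounded to one decimal."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the six rubric dimensions")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {value}")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total / sum(WEIGHTS.values()), 1)


# Made-up scores, used only to exercise the arithmetic.
example = {
    "Cognitive Load": 4,
    "Visual Coherence": 3,
    "Interaction Clarity": 4,
    "Context Fit": 3,
    "Restraint": 2,
    "Emotional Response": 5,
}
# (12 + 9 + 12 + 6 + 4 + 5) / 14 = 48 / 14, which rounds to 3.4
print(overall_score(example))
```

Because the three high-weight dimensions carry 9 of the 14 weight points, a one-point change in any of them moves the overall score about three times as much as the same change in Emotional Response.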

Comparing Alternatives

When comparing two or more designs:

Run the rubric on each independently
Add a Comparison Table showing side-by-side scores
Declare a winner per dimension and overall
Explain the tradeoffs -- a lower-scoring design may still be right for a specific audience
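
The first three comparison steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names `weighted`, `winner`, and `comparison_table`, the design labels, and the scores are all hypothetical, and ties are reported as "tie" because the text does not say how to break them.

```python
# Hypothetical sketch; names, labels, and scores are illustrative only.
WEIGHTS = {
    "Cognitive Load": 3, "Visual Coherence": 3, "Interaction Clarity": 3,
    "Context Fit": 2, "Restraint": 2, "Emotional Response": 1,
}


def weighted(scores: dict[str, int]) -> float:
    """Weighted average per the Output Template formula, one decimal."""
    total = sum(WEIGHTS[d] * scores[d] for d in WEIGHTS)
    return round(total / sum(WEIGHTS.values()), 1)


def winner(values: dict[str, float]) -> str:
    """Return the label with the highest value, or 'tie' if it is shared."""
    best = max(values.values())
    leaders = [label for label, v in values.items() if v == best]
    return leaders[0] if len(leaders) == 1 else "tie"


def comparison_table(designs: dict[str, dict[str, int]]) -> str:
    """Build a Markdown table of side-by-side scores with a winner column."""
    labels = list(designs)
    lines = [
        "| Dimension | " + " | ".join(labels) + " | Winner |",
        "|" + "---|" * (len(labels) + 2),
    ]
    for dim in WEIGHTS:
        row = {label: designs[label][dim] for label in labels}
        cells = " | ".join(f"{row[label]}/5" for label in labels)
        lines.append(f"| {dim} | {cells} | {winner(row)} |")
    overall = {label: weighted(designs[label]) for label in labels}
    cells = " | ".join(f"{overall[label]} / 5" for label in labels)
    lines.append(f"| Overall | {cells} | {winner(overall)} |")
    return "\n".join(lines)


# Made-up rubric scores for two hypothetical designs.
designs = {
    "Design A": dict(zip(WEIGHTS, [4, 3, 4, 3, 2, 5])),
    "Design B": dict(zip(WEIGHTS, [3, 4, 4, 4, 4, 2])),
}
table = comparison_table(designs)
print(table)
```

The table covers only the mechanical part of the comparison. Explaining the tradeoffs remains a written judgment, since a design that loses on the overall score may still suit a particular audience better.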

Reviewing AI-Generated Designs

AI-generated UI often has specific taste failure modes:

Over-decoration -- gradients, shadows, and effects without purpose
Generic composition -- layouts that feel template-driven rather than content-driven
Inconsistent density -- mixing spacious and cramped sections
Missing edge states -- empty, error, and loading states are not considered
Surface polish without structural clarity -- looks good at first glance but is confusing to use

Flag these explicitly when you detect them.

When Not to Use This Skill

Pure backend/API design with no user-facing component
Code review for logic correctness (use a code-review skill instead)
Accessibility audits (this skill covers taste, not WCAG compliance -- though the two overlap)

Additional Resources

For full research citations and sources, see references/research-sources.md
For worked examples of the rubric in action, see examples.md
