audit
MANDATORY PREPARATION
Users start this workflow with /audit. Once this skill is active, load $frontend-design — it contains design principles, anti-patterns, and the Context Gathering Protocol. Follow that protocol before proceeding — if no design context exists yet, you MUST load $setup first.
Run systematic technical quality checks and generate a comprehensive report. Don't fix issues — document them for other commands to address.
This is a code-level audit, not a design critique. Check what's measurable and verifiable in the implementation.
Consult the hierarchy checklist when reviewing grayscale hierarchy, action priority, section-title restraint, and label/value treatment. Consult the ai slop detection when checking for generic trend-driven anti-patterns. Consult the action hierarchy when reviewing primary, secondary, tertiary, and destructive actions. Consult the semantic color when checking whether color is communicating state or just decoration. Consult the surface separation when checking whether borders, shadows, cards, overlap, and background shifts are being used intentionally. Consult the image treatment when screenshots, icons, or media handling affect usability or polish.
Still, when the implementation clearly violates the shared design system or obvious Refactoring UI principles — weak hierarchy, arbitrary spacing, gray text on color, every button styled as primary — call it out as an implementation issue, not a matter of taste.
Diagnostic Scan
Run comprehensive checks across 5 dimensions. Score each dimension 0-4 using the criteria below.
1. Accessibility (A11y)
Check for:
- Contrast issues: Text contrast ratios < 4.5:1 (or 7:1 for AAA)
- Missing ARIA: Interactive elements without proper roles, labels, or states
- Keyboard navigation: Missing focus indicators, illogical tab order, keyboard traps
- Semantic HTML: Improper heading hierarchy, missing landmarks, divs instead of buttons
- Alt text: Missing or poor image descriptions
- Form issues: Inputs without labels, poor error messaging, missing required indicators
Score 0-4: 0=Inaccessible (fails WCAG A), 1=Major gaps (few ARIA labels, no keyboard nav), 2=Partial (some a11y effort, significant gaps), 3=Good (WCAG AA mostly met, minor gaps), 4=Excellent (WCAG AA fully met, approaches AAA)
2. Performance
Check for:
- Layout thrashing: Reading/writing layout properties in loops
- Expensive animations: Animating layout properties (width, height, top, left) instead of transform/opacity
- Interaction latency: Common actions provide no immediate feedback, rely on blank waits, or miss obvious optimistic/prefetch/progressive-loading opportunities
- Missing optimization: Images without lazy loading, unoptimized assets, missing will-change
- Bundle size: Unnecessary imports, unused dependencies
- Render performance: Unnecessary re-renders, missing memoization
Score 0-4: 0=Severe issues (layout thrash, unoptimized everything), 1=Major problems (no lazy loading, expensive animations), 2=Partial (some optimization, gaps remain), 3=Good (mostly optimized, minor improvements possible), 4=Excellent (fast, lean, well-optimized)
3. Theming
Check for:
- Hard-coded colors: Colors not using design tokens
- Broken dark mode: Missing dark mode variants, poor contrast in dark theme
- Inconsistent tokens: Using wrong tokens, mixing token types
- Theme switching issues: Values that don't update on theme change
- Missing color ramps: Ad-hoc one-off shades instead of a defined palette
- Too many shades without system: Slightly different blues/greys everywhere with no clear ramp
Score 0-4: 0=No theming (hard-coded everything), 1=Minimal tokens (mostly hard-coded), 2=Partial (tokens exist but inconsistently used), 3=Good (tokens used, minor hard-coded values), 4=Excellent (full token system, dark mode works perfectly)
4. Responsive Design
Check for:
- Fixed widths: Hard-coded widths that break on narrow viewports
- Touch targets: Interactive elements < 44x44px
- Horizontal scroll: Content overflow on narrow viewports
- Text scaling: Layouts that break when text size increases
- Missing breakpoints: No narrow/medium viewport variants
Score 0-4: 0=Wide-layout-only (breaks on narrow viewports), 1=Major issues (some breakpoints, many failures), 2=Partial (works across viewports, rough edges), 3=Good (responsive, minor target or overflow issues), 4=Excellent (fluid, all viewports, proper target sizing)
5. Anti-Patterns (CRITICAL)
Check against ALL the DON'T guidelines in the frontend-design skill and the ai slop detection reference. Look for AI slop tells (AI color palette, gradient text, glassmorphism, hero metrics, card grids, generic fonts) and general design anti-patterns (gray on color, nested cards, bounce easing, redundant copy).
Also run these implementation-level hierarchy checks:
- Hierarchy check: Can a user identify primary, secondary, and tertiary elements within about 2 seconds?
- System check: Do spacing, type, color, radius, and elevation appear to come from systems rather than arbitrary values?
- Action check: Is there one obvious primary action, with quieter secondary and tertiary actions?
- Border check: Are borders doing work that spacing, contrast, or background shifts should handle instead?
- Label:value anti-pattern test: Is data forced into rigid label/value formatting even when values or context already explain themselves?
- Unnecessary colored-link emphasis test: Are links colored so loudly that they compete with higher-priority actions or content?
- Grey-on-color fail test: Is secondary text on colored surfaces washed out because it uses generic grey instead of a related shade?
- Ambiguous grouping test: Are groups separated less clearly than the items inside them?
- Too-many-font-sizes test: Has typography drifted into a pile of near-identical sizes without a clear scale?
- Section title semantic/visual mismatch test: Are headings visually oversized just because their semantic level sounds important?
- Right-aligned-numbers test: Are numeric columns left-aligned where comparison would benefit from right alignment?
- Section title too loud test: Are section headings stealing attention from the content they label?
- Line-length / line-height test: Are paragraphs too wide for their line-height, or too tightly spaced for comfortable reading?
- Action hierarchy flattening test: Are multiple actions styled with the same urgency when only one should lead?
- Scaled-down screenshot legibility test: Are screenshots too small to communicate useful detail or structure?
- Scaled-up icon chunkiness test: Are tiny icons enlarged far past their intended size?
- Surface separation test: Are borders, shadows, cards, and background shifts stacked redundantly instead of chosen intentionally?
- Overlap clash test: Are overlapping images/cards colliding without clean separation or readable layering?
- Dark-pattern check: Are there misleading labels, preselected exploitative options, obstructed cancellation/consent flows, fake urgency, or hierarchy that pressures the wrong choice?
- Guardrail check: Do bulk/destructive/admin/powerful actions lack confirmations, undo, permission boundaries, or other proportional safeguards?
Score 0-4: 0=AI slop gallery (5+ tells), 1=Heavy AI aesthetic (3-4 tells), 2=Some tells (1-2 noticeable), 3=Mostly clean (subtle issues only), 4=No AI tells (distinctive, intentional design)
Generate Report
Audit Health Score
| # | Dimension | Score | Key Finding |
|---|---|---|---|
| 1 | Accessibility | ? | [most critical a11y issue or "--"] |
| 2 | Performance | ? | |
| 3 | Responsive Design | ? | |
| 4 | Theming | ? | |
| 5 | Anti-Patterns | ? | |
| Total | ??/20 | [Rating band] |
Rating bands: 18-20 Excellent (minor polish), 14-17 Good (address weak dimensions), 10-13 Acceptable (significant work needed), 6-9 Poor (major overhaul), 0-5 Critical (fundamental issues)
Anti-Patterns Verdict
Start here. Pass/fail: Does this look AI-generated? List specific tells. Be brutally honest.
Include a one-line hierarchy verdict as part of this section: either the screen has clear hierarchy, or the main ambiguity you observed.
Executive Summary
- Audit Health Score: ??/20 ([rating band])
- Total issues found (count by severity: P0/P1/P2/P3)
- Top 3-5 critical issues
- Recommended next steps
Detailed Findings by Severity
Tag every issue with P0-P3 severity:
- P0 Blocking: Prevents task completion — fix immediately
- P1 Major: Significant difficulty or WCAG AA violation — fix before release
- P2 Minor: Annoyance, workaround exists — fix in next pass
- P3 Polish: Nice-to-fix, no real user impact — fix if time permits
For each issue, document:
- [P?] Issue name
- Location: Component, file, line
- Category: Accessibility / Performance / Theming / Responsive / Anti-Pattern
- Impact: How it affects users
- WCAG/Standard: Which standard it violates (if applicable)
- Recommendation: How to fix it
- Suggested command: Which command to use (prefer: /animate, /arrange, /critique, /extract, /polish, /optimize, /audit, /typeset, /bolder, /clarify, /delight, /adapt, /colorize, /quieter, /harden, /distill, /onboard, /normalize, /showcase)
Patterns & Systemic Issues
Identify recurring problems that indicate systemic gaps rather than one-off mistakes:
- "Hard-coded colors appear in 15+ components, should use design tokens"
- "Interactive targets consistently too small (<44px) throughout narrow-layout flows"
- "Spacing values appear arbitrary; groups and sections don't use a recognizable rhythm"
- "Multiple actions are styled as primary, flattening decision-making"
- "Section titles are consistently louder than the content they introduce"
- "Paragraph widths and line-heights are mismatched across text-heavy screens"
Positive Findings
Note what's working well — good practices to maintain and replicate.
Recommended Actions
List recommended commands in priority order (P0 first, then P1, then P2):
- [P?]
/command-name— Brief description (specific context from audit findings) - [P?]
/command-name— Brief description (specific context)
Rules: Only recommend commands from: /animate, /arrange, /critique, /extract, /polish, /optimize, /audit, /typeset, /bolder, /clarify, /delight, /adapt, /colorize, /quieter, /harden, /distill, /onboard, /normalize, /showcase. Map findings to the most appropriate command. End with /polish as the final step if any fixes were recommended.
After presenting the summary, tell the user:
You can ask me to run these one at a time, all at once, or in any order you prefer.
Re-run
/auditafter fixes to see your score improve.
IMPORTANT: Be thorough but actionable. Too many P3 issues creates noise. Focus on what actually matters.
NEVER:
- Report issues without explaining impact (why does this matter?)
- Provide generic recommendations (be specific and actionable)
- Skip positive findings (celebrate what works)
- Forget to prioritize (everything can't be P0)
- Report false positives without verification
- Treat a clear hierarchy failure as subjective fluff — if users can't tell what matters, that is an implementation problem
Remember: You're a technical quality auditor. Document systematically, prioritize ruthlessly, cite specific code locations, and provide clear paths to improvement.