audit-design-system
Audit Design System
Review a Figma node for evidence that the design is not properly integrated with the design system.
This skill is read-only. When the user wants a write action afterward, downstream skills should use use_figma through a figma-use-style helper when the host environment requires one.
Output Format Selection
- Explicit user request wins:
  - If the user asks for --json or JSON, output raw JSON (no markdown fences, no prose).
  - If the user asks for --markdown, markdown, or a specific human-readable format, output the human-readable markdown report.
- Codex Desktop app: Output raw JSON by default.
- Codex CLI and other chat-style environments: Output the human-readable markdown report by default.
- Machine-consumed review surfaces: Output raw JSON by default.
- Ambiguous environment: If the environment is unclear, output markdown by default.
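The precedence above can be sketched as a small function. This is only an illustration: the environment labels ("codex-desktop", "machine-review") and the function name are hypothetical, since the skill does not define identifiers for environments, and the user's request is assumed to have already been reduced to "json", "markdown", or None.

```python
from typing import Optional

# Hypothetical labels for the environments that default to JSON.
JSON_DEFAULT_ENVIRONMENTS = {"codex-desktop", "machine-review"}


def select_output_format(requested: Optional[str], environment: Optional[str]) -> str:
    """Return "json" or "markdown" using the precedence described above."""
    # 1. An explicit user request always wins.
    if requested in ("json", "markdown"):
        return requested
    # 2. Codex Desktop and machine-consumed surfaces default to JSON.
    if environment in JSON_DEFAULT_ENVIRONMENTS:
        return "json"
    # 3. Chat-style and ambiguous environments default to markdown.
    return "markdown"
```

For example, select_output_format(None, None) falls through to "markdown", which matches the rule for an ambiguous environment.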
Workflow
- Parse the Figma input. Accept a full Figma URL, or a fileKey and nodeId. Normalize node IDs from 72-293 to 72:293 when needed.
- Pull the minimum required evidence with Figma MCP read tools.
  - Call get_design_context for the exact node under review.
  - Call get_screenshot for visual confirmation.
  - Call get_variable_defs to see which variables are actually bound.
  - Call get_code_connect_map when relevant.
  - Call get_metadata when the reviewed node is large, repeated, or board-like and you need to map nested instances before drilling in.
  - Call search_design_system when you have identified a likely non-systemized primitive and there is a realistic chance of suggesting a concrete replacement from the audited design system.
- Review for systemization failures, not visual taste. Look for places where the design should probably inherit from the design system but is locally constructed instead. Base every finding on structure visible in Figma: instances, duplicated frames, raw values, variant drift, or missing token bindings. Prefer omissions over weak findings.
- When the evidence is strong enough, suggest a replacement candidate. After identifying a likely custom primitive, use search_design_system to find the closest matching component family from the audited design system. Include a candidate only when the match is credible from structure and naming, not just screenshot similarity. If search results are noisy or ambiguous, omit the candidate instead of guessing.
- Present findings in the appropriate format based on the environment. Use JSON for Codex Desktop and machine-consumed review surfaces, markdown for Codex CLI and other chat-style environments (see Output Format Selection above).
- When the user wants a fix, route to the right downstream skill. Prefer fix-design-system-finding when one specific offending node should be repaired. Prefer apply-design-system when the user wants a broader screen-wide pass, multiple sections need coordinated remediation, or the review is being used to define scope before writing.
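The input-parsing step can be illustrated with a minimal sketch. It assumes a URL layout of /design/<fileKey>/<name>?node-id=72-293 (with /file/ accepted as an alternative first segment); that layout and both function names are assumptions for illustration, not something this skill specifies.

```python
import re
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse


def normalize_node_id(node_id: str) -> str:
    """Convert the hyphenated form 72-293 to 72:293; leave other IDs unchanged."""
    match = re.fullmatch(r"(\d+)-(\d+)", node_id)
    return f"{match.group(1)}:{match.group(2)}" if match else node_id


def parse_figma_url(url: str) -> Tuple[str, Optional[str]]:
    """Extract (fileKey, nodeId) from a Figma URL under the assumed layout."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # Assumed path shape: /design/<fileKey>/<name> or /file/<fileKey>/<name>.
    if len(parts) < 2 or parts[0] not in ("design", "file"):
        raise ValueError(f"Unrecognized Figma URL: {url}")
    file_key = parts[1]
    # The node ID is optional; when present it uses the hyphenated form.
    node_values = parse_qs(parsed.query).get("node-id")
    node_id = normalize_node_id(node_values[0]) if node_values else None
    return file_key, node_id
```

IDs that are already in colon form pass through normalize_node_id unchanged, so the function is safe to call on either input style.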
What To Flag
- Shared UI primitives recreated as ad-hoc frames instead of component instances. Common targets: buttons, icon buttons, cards, alerts, pills, chips, avatars, stat tiles, tab bars, nav bars, FABs, list rows.
- Repeated sibling structures that should clearly collapse into one reusable primitive. Example: three nearly identical stat tiles with different content.
- Hard-coded visual values where the rest of the design system uses variables. Common targets: fills, strokes, text colors, radius, spacing, typography, shadows. Only flag this when the evidence is concrete, such as a raw hex value or bespoke geometry sitting beside tokenized peers.
- Global navigation or other high-leverage patterns built from custom frames instead of system components. Flag these aggressively because drift there scales across many screens.
- Variant drift inside a nominal component. Example: a local edit button with unusual size, stroke width, or radius that does not match the expected icon-button primitive.
What Not To Flag
- Purely aesthetic preferences.
- Copywriting or product decisions.
- Layout choices that can reasonably remain screen-specific.
- One-off compositions when the underlying primitives are already componentized and tokenized.
- Claims that require undocumented assumptions about a design library.
Evidence Standard
Every finding must answer both questions:
- What concrete Figma evidence shows this is not systemized correctly?
- Why does that matter for propagation, consistency, theming, or maintenance?
Good evidence includes:
- a node is a plain frame when it should be an instance
- several siblings duplicate the same structure
- raw color or geometry values appear where variables or standard primitives should apply
- a global pattern is custom-built
Weak evidence includes:
- "this looks custom"
- "I would normally make this a component"
- any statement based only on screenshot aesthetics without structural support
Replacement Suggestion Rule
When a finding is about a missing shared primitive, try to attach one likely replacement suggestion.
Use search_design_system after you already know what category of thing is missing, for example:
- custom avatar cluster
- bespoke stat tile
- local alert card
- hand-built navigation item
Only suggest a replacement when:
- the node's role is clear
- the search result belongs to the relevant library or audited file context
- the candidate is structurally plausible for the finding
Good suggestion language:
- This custom avatar frame could likely be replaced with Avatar from library X.
- These repeated stat tiles appear to map to Metric item from library X.
Do not overstate:
- do not claim the suggested component is definitely correct unless the evidence is explicit
- do not force a replacement candidate into every finding
- do not recommend a component from an unrelated library just because search returned it first
Output Format
JSON Output
When the selected output format is JSON, return this exact JSON shape with no markdown fences and no extra prose:
{
  "findings": [
    {
      "title": "<= 80 chars, imperative>",
      "body": "<valid Markdown explaining why this is a problem>",
      "confidence_score": 0.0,
      "priority": 0,
      "code_location": {
        "absolute_file_path": "/figma/<fileKey>/nodes/<nodeId>",
        "line_range": {
          "start": 1,
          "end": 1
        }
      }
    }
  ],
  "overall_correctness": "patch is correct" | "patch is incorrect",
  "overall_explanation": "<1-3 sentence summary>",
  "overall_confidence_score": 0.0
}
Schema notes:
- Use overall_correctness: "patch is incorrect" whenever you found one or more design-system integration issues.
- Use overall_correctness: "patch is correct" only when there are no findings.
- For each finding, set code_location.absolute_file_path to /figma/<fileKey>/nodes/<nodeId> using the most specific offending node.
- Always set line_range.start and line_range.end to 1.
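As a sketch of how the schema notes fit together, the helpers below assemble a report in the shape shown above. The function names make_finding and build_report are hypothetical; only the field names, the path format, and the fixed line range come from this document.

```python
import json
from typing import Any, Dict, List


def make_finding(title: str, body: str, confidence_score: float,
                 priority: int, file_key: str, node_id: str) -> Dict[str, Any]:
    """Build one finding anchored to the most specific offending node."""
    if len(title) > 80:
        raise ValueError("title must be 80 characters or fewer")
    return {
        "title": title,
        "body": body,
        "confidence_score": confidence_score,
        "priority": priority,
        "code_location": {
            # Synthetic path built from the Figma identifiers, not a real file.
            "absolute_file_path": f"/figma/{file_key}/nodes/{node_id}",
            # The schema notes fix both ends of the range at 1.
            "line_range": {"start": 1, "end": 1},
        },
    }


def build_report(findings: List[Dict[str, Any]], explanation: str,
                 confidence: float) -> str:
    """Serialize the report; any finding makes the verdict "patch is incorrect"."""
    report = {
        "findings": findings,
        "overall_correctness": "patch is incorrect" if findings else "patch is correct",
        "overall_explanation": explanation,
        "overall_confidence_score": confidence,
    }
    return json.dumps(report, indent=2)
```

Deriving overall_correctness from the findings list, rather than passing it in, keeps the two values from disagreeing.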
Human-Readable Markdown Report
When the selected output format is markdown, present a formatted markdown report with:
- Header section:
  - File name and node being reviewed
  - Overall verdict: ✅ Passes / ⚠️ Needs Work / ❌ Significant Issues
  - Confidence percentage
- Summary: 2-3 sentences explaining the overall state
- Findings table: Quick overview with priority indicators
  - 🔴 Critical (priority 3): severe library-level or navigation-level issues
  - 🟠 High (priority 2): important reusable primitive or tokenization issues
  - 🟡 Medium (priority 1): moderate system drift
  - ⚪ Low (priority 0): nits or low-impact consistency issues
- Details section: Expand each finding with:
  - What's wrong (concrete evidence from Figma structure)
  - Why it matters (maintenance, consistency, theming impact)
  - Likely replacement, when supported by search_design_system
  - Affected node IDs for reference
- Recommendations: Prioritized action items
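One possible way to render the findings table is sketched below. The column layout and the node_id key on each finding are assumptions made for the example; the document fixes only the priority indicators.

```python
from typing import Any, Dict, List

# Priority indicators as listed in the report description above.
PRIORITY_INDICATORS = {3: "🔴 Critical", 2: "🟠 High", 1: "🟡 Medium", 0: "⚪ Low"}


def render_findings_table(findings: List[Dict[str, Any]]) -> str:
    """Render a markdown table, highest priority first (hypothetical columns)."""
    lines = ["| Priority | Finding | Node |", "| --- | --- | --- |"]
    for finding in sorted(findings, key=lambda f: -f["priority"]):
        indicator = PRIORITY_INDICATORS[finding["priority"]]
        lines.append(f"| {indicator} | {finding['title']} | {finding['node_id']} |")
    return "\n".join(lines)
```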
Output Rules
- Keep findings focused on the highest-signal issues. Usually 0-6 findings.
- Keep titles imperative and under 80 characters.
- Always anchor each finding to a specific node ID so users can locate it in Figma.
- For JSON output, do not invent filesystem paths. Use /figma/<fileKey>/nodes/<nodeId> exactly.
- When a replacement suggestion is credible, include it in the finding body.
Review Heuristics
Use priority like this:
- 0: nit or low-impact consistency issue
- 1: moderate system drift
- 2: important reusable primitive or tokenization issue
- 3: severe library-level or navigation-level issue likely to propagate widely
Use confidence_score like this:
- 0.9-1.0: direct structural evidence
- 0.7-0.89: strong inference from repetition and nearby token usage
- 0.5-0.69: plausible but incomplete evidence; prefer omitting instead
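The confidence bands can be applied mechanically, as in this sketch. Treating "prefer omitting" as a hard cutoff at 0.7 and capping the list at six findings are simplifying assumptions, and both function names are hypothetical.

```python
from typing import Any, Dict, List


def confidence_band(score: float) -> str:
    """Map a confidence_score to the band descriptions listed above."""
    if score >= 0.9:
        return "direct structural evidence"
    if score >= 0.7:
        return "strong inference"
    if score >= 0.5:
        return "plausible but incomplete evidence"
    return "below the documented range"


def select_findings(findings: List[Dict[str, Any]],
                    min_confidence: float = 0.7,
                    limit: int = 6) -> List[Dict[str, Any]]:
    """Drop weak findings, then order by priority and confidence (assumed policy)."""
    kept = [f for f in findings if f["confidence_score"] >= min_confidence]
    kept.sort(key=lambda f: (-f["priority"], -f["confidence_score"]))
    return kept[:limit]
```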
Board And Screen Scope
For a single screen:
- inspect the root node
- drill into repeated or high-leverage children
- anchor findings to the most specific offending node
For a board or larger page:
- use get_metadata first to identify candidate screens or repeated modules
- review only the most relevant nodes instead of trying to audit everything
- keep findings scoped and evidence-backed
Example Trigger Phrases
- "Review this Figma screen for design-system integration"
- "Audit this board for missing component usage"
- "Check whether this design uses tokens correctly"
- "/audit-design-system https://figma.com/design/..."
- "/audit-design-system --json https://figma.com/design/..." (for JSON output)
Handoff Guidance
Use this routing rule after the review:
- one concrete finding with a narrow write scope: use fix-design-system-finding
- several findings that collapse into a broader screen or section reconciliation pass: use apply-design-system
Do not force every review result through the single-finding fix skill. Some reviews are better used as scope discovery for a broader apply pass.
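The routing rule can be approximated in code, with the caveat that the real decision involves judgment about scope. This sketch reduces it to two inputs, a finding count and a narrow_write_scope flag, both of which are assumptions introduced for illustration.

```python
from typing import Optional


def choose_downstream_skill(finding_count: int, narrow_write_scope: bool) -> Optional[str]:
    """Pick a downstream skill name, or None when there is nothing to fix."""
    if finding_count <= 0:
        return None
    # One concrete finding with a narrow write scope goes to the single-finding fix.
    if finding_count == 1 and narrow_write_scope:
        return "fix-design-system-finding"
    # Everything else is treated as scope discovery for a broader apply pass.
    return "apply-design-system"
```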