visual-debug

Visual Debug

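Automated visual comparison of an original page against its implementation. The comparison runs through AE (absolute error, a count of differing pixels) and SSIM CLI tools, so it spends zero vision tokens.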

When to use

  • After implementing a section, before declaring "done"
  • When the user says "it's different" or "doesn't match"
  • During ui-reverse-engineering Phase C
  • Instead of Read-ing screenshots for comparison

HARD RULE: Never Read ref/impl images for comparison. Only read DIFF images for FAIL positions. Exception: Phase E reads ref+impl pairs.

Token rule

Pipe large eval output to a file, then Read only what you need:

agent-browser --session <s> eval "<script>" > tmp/ref/<name>.json

Never let large JSON print to stdout — it wastes tokens.
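
One way to "Read only what you need" is to check the size of a saved dump and look at a bounded slice of it. The helper below is a minimal sketch; the name peek and the 400-byte default are assumptions for illustration, not part of the skill's scripts.

```shell
# peek <file> [bytes]: report the file size, then print only the first N bytes.
# Hypothetical helper; it keeps a large eval dump from flooding stdout.
peek() {
  local file="$1" limit="${2:-400}"
  echo "size: $(wc -c < "$file" | tr -d ' ') bytes"
  head -c "$limit" "$file"
  echo
}

# Usage, after an eval has been redirected to a file:
#   peek tmp/ref/<name>.json 400
```

If the size line shows the file is small, reading it whole is fine; otherwise target the part you need.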

Dependencies

brew install imagemagick ffmpeg dssim
which agent-browser
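
A quick preflight can confirm that every tool is on the PATH before a run starts. This is a sketch under one assumption: that the scripts call ImageMagick through its compare binary. The function name check_deps is hypothetical.

```shell
# check_deps <cmd...>: list every command that is not on the PATH.
# Returns 0 when all are present, 1 otherwise.
check_deps() {
  local cmd missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# compare is assumed to be the ImageMagick binary the scripts rely on.
check_deps compare ffmpeg dssim agent-browser ||
  echo "Install the missing tools before running the scripts."
```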

Scripts

SCRIPTS_DIR="${CLAUDE_PLUGIN_ROOT:+$CLAUDE_PLUGIN_ROOT/skills/visual-debug/scripts}"
SCRIPTS_DIR="${SCRIPTS_DIR:-$(find ~/.claude/skills -name 'ae-compare.sh' -exec dirname {} \; 2>/dev/null | head -1)}"

| Script | Purpose |
|---|---|
| batch-scroll.sh <orig> <impl> <session> [dir] | Captures both at 0–100% scroll positions |
| ae-compare.sh <ref.png> <impl.png> [diff.png] | AE comparison → prints an AE= value and a STATUS= verdict such as STATUS=PASS |
| batch-compare.sh <dir> [threshold] | Compares all pairs. Supports dynamic thresholds |
| dssim-compare.sh <dir> [threshold] | Structural similarity (catches what AE misses) |
| layout-diff.sh <session> <orig> <impl> | Section bounding box comparison |
| computed-diff.sh <session> <orig> <impl> <sel...> | getComputedStyle comparison |
| section-compare.sh <orig> <impl> <session> [dir] | Section-level comparison: crops each section, then runs AE + structure diff. Catches SVG-as-text and layout mismatches |
| transition-compare.sh <orig> <impl> <session> [dir] | Transition comparison: idle/hover screenshots + computedStyle + timing diff per element |
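
For orientation, an AE comparison can be sketched with ImageMagick's compare tool, which writes the metric to stderr and exits 0 for matching images, 1 for differing images, and 2 on error. This is not the source of ae-compare.sh; the function name ae_count is hypothetical, and the parsing assumes the count is the first word of the output (some versions append a normalized value in parentheses, and very large counts may be printed in scientific notation).

```shell
# ae_count <ref.png> <impl.png> [diff.png]: print the number of differing pixels.
# Sketch only; assumes ImageMagick's compare is installed.
ae_count() {
  local out status=0
  # "null:" discards the diff image when no output path is given.
  out=$(compare -metric AE "$1" "$2" "${3:-null:}" 2>&1) || status=$?
  if [ "$status" -gt 1 ]; then
    echo "compare failed: $out" >&2
    return 2
  fi
  # Keep only the leading count, dropping any parenthesized suffix.
  printf '%s\n' "${out%% *}"
}

# Usage:
#   ae_count ref.png impl.png diff.png
```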

Workflow

Full-page comparison (broad sweep)

1. Capture    batch-scroll.sh <orig> <impl> <session>
2. AE diff    batch-compare.sh <dir>
3. DSSIM      dssim-compare.sh <dir>
4. Diagnose   Read ONLY diff images for FAIL positions
5. Fix        Targeted code change
6. Re-compare Repeat 2–3
7. LLM review Read ref+impl pairs for ALL positions (Phase E)
8. Gate       All three axes PASS → DONE
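
Steps 1 to 3 can be chained in a small wrapper. The sketch below assumes SCRIPTS_DIR is already resolved and that each script exits non-zero on failure, which the source does not confirm; run_sweep is a hypothetical name. It passes the optional [dir] argument explicitly so all three scripts work on the same directory.

```shell
# run_sweep <orig> <impl> <session> <dir>: capture, then run both automated axes.
# Assumption: the scripts signal failure through their exit status.
run_sweep() {
  local orig="$1" impl="$2" session="$3" dir="$4" rc=0
  # Without captures there is nothing to compare, so stop early.
  "$SCRIPTS_DIR/batch-scroll.sh" "$orig" "$impl" "$session" "$dir" || return 1
  # Run both comparisons even if the first one fails, to get both reports.
  "$SCRIPTS_DIR/batch-compare.sh" "$dir" || rc=1
  "$SCRIPTS_DIR/dssim-compare.sh" "$dir" || rc=1
  return "$rc"
}

# Usage:
#   run_sweep <orig> <impl> <session> tmp/compare
```

A non-zero result points to step 4 (diagnose from the diff images); a zero result still leaves the Phase E review to do.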

Section-level comparison (precise — preferred for post-gen verification)

1. Section compare  section-compare.sh <orig> <impl> <session>
   → Per-section AE + structure diff (SVG-as-text, layout type, height)
2. Transition compare  transition-compare.sh <orig> <impl> <session>
   → Per-element idle/hover style + timing diff
3. Fix          Targeted code change per failing section/element
4. Re-compare   Repeat 1–2
5. Gate         All sections PASS + all transitions PASS → DONE

Use section-level for ui-reverse-engineering Step 8b/8c. Use full-page for standalone /visual-debug invocations.

Three-axis verification (ALL required)

| Axis | Tool | Catches | Blind spot |
|---|---|---|---|
| Pixel | AE | Exact rendering diff | Lottie frame differences (false positive) |
| Perceptual | DSSIM | Color/tone mismatch | Missing content on same-color bg |
| Semantic | LLM (Phase E) | Missing sections, wrong content | Slow, costs tokens |

A position is PASS only when all three agree (or LLM explicitly approves a known difference).

Phase E: LLM Review (MANDATORY)

After AE + DSSIM, read every position's ref+impl pair and judge it PASS / PARTIAL / FAIL. This step is not optional: automated metrics can silently pass wrong results. Expect it to cost roughly 44K tokens.

Thresholds

| Metric | Pass | Fail |
|---|---|---|
| AE per image | ≤ 500 | > 500 |
| SSIM per frame | ≥ 0.995 | < 0.995 |
| Computed style diff | 0 mismatches | > 0 |

The AE threshold of 500 differing pixels absorbs anti-aliasing variance. Raise it to 2000 for positions with dynamic content.
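
The two automated thresholds can be combined into a single check per position. This is a sketch, not the logic of batch-compare.sh: the function name verdict is hypothetical, and it covers only the pixel and perceptual axes, so a PASS here still needs the Phase E review.

```shell
# verdict <ae> <ssim> [ae_threshold]: PASS only when both metrics are in range.
# The AE threshold defaults to 500; pass 2000 for dynamic content.
verdict() {
  local ae="$1" ssim="$2" ae_max="${3:-500}"
  # awk handles the floating-point SSIM comparison.
  awk -v ae="$ae" -v ssim="$ssim" -v ae_max="$ae_max" 'BEGIN {
    if (ae + 0 <= ae_max + 0 && ssim + 0 >= 0.995) print "PASS"
    else print "FAIL"
  }'
}

verdict 320 0.9981         # PASS
verdict 1200 0.9990        # FAIL: AE above the default threshold
verdict 1200 0.9990 2000   # PASS: relaxed threshold for dynamic content
```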

Full verification

  • verification.md — Phase A/B (capture) + D (pixel-perfect gate) + auxiliary checks
  • comparison-fix.md — Phase C (AE+DSSIM comparison, computed-style diagnosis, Phase E LLM review, Phase H self-healing loop)

Browser cleanup (MANDATORY)

Every skill run MUST end with browser cleanup — success, failure, or interruption.

# Always close your own session(s) by name
agent-browser --session <session-name> close
  • Close every --session <name> you opened during the comparison
  • Run cleanup before returning control to the user, even on error/early exit
  • Unclosed sessions spawn Chrome Helper processes (GPU + Renderer) that persist indefinitely
  • Never use close --all — other Claude sessions may have active browsers. Only close sessions you own.
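
A trap is one way to make the cleanup run on success, failure, and interruption alike. The sketch below is an assumption-laden illustration: AGENT_BROWSER, SESSIONS, track_session, and close_sessions are hypothetical names, and only the close invocation itself comes from this document.

```shell
# Hypothetical names for this sketch; AGENT_BROWSER can be overridden for testing.
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"
SESSIONS=""

# Record each session name at the moment you open it.
track_session() { SESSIONS="$SESSIONS $1"; }

# Close only the sessions recorded above, never "close --all".
close_sessions() {
  local name
  for name in $SESSIONS; do
    "$AGENT_BROWSER" --session "$name" close || true
  done
  SESSIONS=""
}

# Run cleanup on normal exit; turn signals into an exit so the EXIT trap fires.
trap close_sessions EXIT
trap 'exit 130' INT TERM
```

Call track_session right after opening each session so that an early exit between steps still closes everything you own.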

Integration

| Skill | Where |
|---|---|
| ui-reverse-engineering Step 8+9 | Full verification procedure |
| ui-reverse-engineering Step T4 | Phase D for transition resting states |
| ui-capture Phase 4A | Phase D before compare.html |
| Standalone | batch-scroll + batch-compare on any two URLs |

First seen: Apr 14, 2026