Agency Evaluation Criteria Skill
Governs quality assessment of all agency project deliverables. Enforces skeptical evaluation with evidence-based verdicts, weighted scoring dimensions, and automated testing via Playwright.
Static Zone
Identity
Purpose: Define evaluation criteria, scoring weights, pass/fail thresholds, and testing requirements for agency project quality assessment.
Input Contract:
- Built application (URL or local path)
- Original copy.md (for copy integrity verification)
- Original design-spec.md (for design compliance verification)
- BRIEF document (for completeness verification)
Output Contract:
evaluation-report.md containing:
- Overall score (0.00 - 1.00) with PASS/FAIL verdict
- Per-dimension scores with evidence
- Specific defect list with file:line references
- Screenshots (desktop + mobile)
- Improvement recommendations
Owner: .agency/context/quality-standards.md
Core Principles
Derived from Brand Context. Never auto-modified. Manual editing only.
- Skeptical by default — tuned to find defects, not rationalize acceptance
- Evidence-based verdicts only — no PASS without concrete proof
- Copy integrity is non-negotiable — any deviation from original copy = FAIL
- AI slop detection — purple gradients + white cards + generic icons = FAIL
- When in doubt, FAIL: a missed defect (false negative) is costlier than a wrongly rejected deliverable (false positive)
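The copy integrity principle can be approximated with a one-directional check: every line of copy.md should appear verbatim in the text rendered by the built application. The sketch below is illustrative only. The function names, the whitespace normalization, and the stripping of leading markdown markers are all assumptions, and the check does not catch extra text added to the page or inline markup such as bold or links.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace runs so layout differences do not count as deviations."""
    return re.sub(r"\s+", " ", text).strip()


def missing_copy(copy_md: str, rendered_text: str) -> list[str]:
    """Return the copy lines that do not appear verbatim in the rendered page text."""
    haystack = normalize(rendered_text)
    missing = []
    for line in copy_md.splitlines():
        # Assumption: copy.md uses leading markdown markers (#, -, *, >) that
        # are not part of the visible copy, so they are stripped before matching.
        needle = normalize(re.sub(r"^\s*(#{1,6}|[-*>])\s+", "", line))
        if needle and needle not in haystack:
            missing.append(needle)
    return missing


copy_md = "# Hello World\n\n- Fast   delivery\nBook a call"
rendered = "Hello World Fast delivery Book a demo"
print(missing_copy(copy_md, rendered))  # ['Book a call']
```

A non-empty result is evidence for the copy hard threshold. An empty result alone does not prove integrity, because the check runs in one direction only.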
Default Evaluation Weights
| Dimension | Weight | Description |
|---|---|---|
| Design Quality | 30% | Visual consistency, brand alignment, polish |
| Originality | 25% | Not generic/template-like, unique approach |
| Completeness | 25% | All BRIEF sections present, copy accurate |
| Functionality | 20% | Responsive, accessible, all interactions work |
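The overall score can be read as a weighted sum of the four per-dimension scores. A minimal sketch, assuming each dimension is scored on the same 0.00 to 1.00 scale as the overall score; the dictionary keys and the function name are hypothetical. The numeric PASS cutoff for the overall score is not stated in this section, so the sketch stops at the score.

```python
# Weights taken from the Default Evaluation Weights table.
DEFAULT_WEIGHTS = {
    "design_quality": 0.30,
    "originality": 0.25,
    "completeness": 0.25,
    "functionality": 0.20,
}


def overall_score(
    dimension_scores: dict[str, float],
    weights: dict[str, float] = DEFAULT_WEIGHTS,
) -> float:
    """Weighted sum of per-dimension scores, rounded to two decimals."""
    if set(dimension_scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted dimensions")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    for name, score in dimension_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} score {score} is outside [0.0, 1.0]")
    return round(sum(weights[n] * dimension_scores[n] for n in weights), 2)


scores = {
    "design_quality": 0.8,
    "originality": 0.6,
    "completeness": 1.0,
    "functionality": 0.5,
}
print(overall_score(scores))  # 0.74
```

The weight check matters because the Dynamic Zone allows weights to evolve; an edited table that no longer sums to 100% would otherwise silently skew every score.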
Hard Thresholds (always FAIL)
- Copy text differs from original copy.md
- AI slop detected (generic purple gradient + white card layout)
- Mobile viewport broken (content overflow or unreadable)
- CTA count > 1 per page (unless the BRIEF explicitly specifies more)
- Any link returns 404
- Lighthouse Accessibility < 80
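The hard thresholds act as a gate that runs independently of the weighted score: any single hit forces FAIL. A minimal sketch of that gate, assuming the individual checks have already been run and their results collected; the `Findings` structure and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Hypothetical container for results gathered during evaluation."""
    copy_matches_original: bool
    ai_slop_detected: bool
    mobile_viewport_broken: bool
    max_cta_count_per_page: int
    brief_allows_multiple_ctas: bool
    links_returning_404: int
    lighthouse_accessibility: int  # Lighthouse category score, 0-100


def hard_failures(f: Findings) -> list[str]:
    """Return every hard threshold that was hit; any entry means FAIL."""
    failures = []
    if not f.copy_matches_original:
        failures.append("Copy text differs from original copy.md")
    if f.ai_slop_detected:
        failures.append("AI slop detected")
    if f.mobile_viewport_broken:
        failures.append("Mobile viewport broken")
    if f.max_cta_count_per_page > 1 and not f.brief_allows_multiple_ctas:
        failures.append("CTA count > 1 per page")
    if f.links_returning_404 > 0:
        failures.append("Link returns 404")
    if f.lighthouse_accessibility < 80:
        failures.append("Lighthouse Accessibility < 80")
    return failures


clean = Findings(True, False, False, 1, False, 0, 80)
print(hard_failures(clean))  # []
```

Returning the full list rather than stopping at the first hit keeps the defect list in the report complete.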
Testing Requirements
- Playwright: desktop (1280x720) + mobile (375x667) screenshots
- Click test: all buttons, links, CTAs
- Scroll test: full page traversal
- Form test: all input fields (if applicable)
- Lighthouse: Performance, Accessibility, Best Practices, SEO
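The screenshot requirement could be scripted with Playwright's Python sync API roughly as follows. This is a sketch under stated assumptions: the function names, the output directory, and the file naming are hypothetical, and the mobile entry only sets the viewport size without emulating a touch device. The click, scroll, form, and Lighthouse checks are not covered here.

```python
from pathlib import Path

# Viewport sizes from the Testing Requirements list.
VIEWPORTS = {
    "desktop": {"width": 1280, "height": 720},
    "mobile": {"width": 375, "height": 667},
}


def screenshot_path(out_dir: str, name: str) -> Path:
    """Hypothetical naming scheme: one PNG per viewport."""
    return Path(out_dir) / f"{name}.png"


def capture_screenshots(url: str, out_dir: str = "screenshots") -> list[Path]:
    """Save one full-page screenshot per viewport and return the file paths."""
    # Imported lazily so this module loads even where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    saved = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            for name, viewport in VIEWPORTS.items():
                page = browser.new_page(viewport=viewport)
                page.goto(url, wait_until="networkidle")
                target = screenshot_path(out_dir, name)
                page.screenshot(path=str(target), full_page=True)
                page.close()
                saved.append(target)
        finally:
            browser.close()
    return saved
```

Calling `capture_screenshots("http://localhost:3000")` (a placeholder URL) would write `screenshots/desktop.png` and `screenshots/mobile.png`, provided Playwright and its Chromium build are installed.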
Dynamic Zone
Weights, thresholds, and test scenarios evolve via user feedback.
Rules
(No rules yet. Rules will be added as the system learns from user feedback.)
Anti-Patterns
(No anti-patterns yet.)
Heuristics
(No heuristics yet.)
Evolution Log
- v1.0.0: Initial creation (Static Zone with default weights, empty Dynamic Zone)