aeo-spec-validator
AEO Spec Validator
Purpose: Validate task specifications and score them 0-100 to ensure they're sufficiently defined before execution.
When to Use
Invoke this skill when:
- User provides a task description
- Before starting any implementation work
- When task is ambiguous or unclear
Scoring System (0-100)
Clarity Indicators (0-50 points)
Objective Clarity (15 points):
- 15 pts: Explicit, unambiguous objective
- Example: "Add email validation to signup form with regex check"
- 10 pts: Clear but minor ambiguities
- Example: "Add email validation to signup"
- 5 pts: Vague objective
- Example: "Improve signup process"
- 0 pts: No clear objective
Acceptance Criteria (15 points):
- 15 pts: Specific, testable criteria defined
- Example: "Must validate RFC 5322 format, reject + aliases, show inline errors"
- 10 pts: General criteria mentioned
- Example: "Must validate email format and show errors"
- 5 pts: Implied criteria
- Example: "Should work for valid emails"
- 0 pts: No criteria mentioned
Context Provided (10 points):
- 10 pts: Full context (where, why, constraints)
- Example: "For the signup form in /pages/auth/signup.tsx, using existing validator utility"
- 5 pts: Partial context
- Example: "For the signup form"
- 0 pts: No context
Dependencies Identified (10 points):
- 10 pts: All dependencies listed (libraries, services, APIs)
- Example: "Uses validator.js library, calls POST /api/validate"
- 5 pts: Some dependencies mentioned
- Example: "Uses validator library"
- 0 pts: No dependencies mentioned
Quality Indicators (0-30 points)
Tech Stack Specified (10 points):
- 10 pts: Specific technologies/libraries named
- 5 pts: General tech mentioned (e.g., "use a validation library")
- 0 pts: No tech mentioned
Test Requirements (10 points):
- 10 pts: Test cases specified
- Example: "Test valid emails, invalid formats, edge cases (+ alias)"
- 5 pts: General testing mentioned
- Example: "Should have tests"
- 0 pts: No testing mentioned
Performance/Security (10 points):
- 10 pts: Explicit requirements
- Example: "Must reject in <100ms, prevent email injection"
- 5 pts: General concerns mentioned
- Example: "Should be fast and secure"
- 0 pts: No mention
Risk Assessment (0-20 points)
Scope (10 points):
- 10 pts: Small, well-defined scope (1-2 files, <100 LOC)
- 5 pts: Medium scope (3-5 files, 100-500 LOC)
- 0 pts: Large scope (5+ files, 500+ LOC)
Complexity (10 points):
- 10 pts: Simple CRUD or clear logic
- 5 pts: Moderate complexity (multiple systems, integration)
- 0 pts: High complexity (architectural changes, unknowns)
Ambiguity Detection
Automatically detect and penalize these red flags:
Subjective Terms (-5 each):
- "fast", "quick", "performant"
- "good", "better", "optimal"
- "simple", "clean", "elegant"
- "user-friendly", "intuitive"
Vague Verbs (-10 each):
- "improve" (without specifics)
- "optimize" (without metrics)
- "enhance" (without details)
- "fix" (without describing what's broken)
Missing Constraints (-5 each):
- No performance requirements
- No error handling specified
- No edge cases mentioned
- No security considerations (for sensitive areas)
Scoring Examples
Example 1: Well-Defined Spec (Score: 92/100)
Task: Add email validation to the signup form in /pages/auth/signup.tsx
Requirements:
- Validate using RFC 5322 format via validator.js library
- Reject email addresses with + aliases
- Show inline error message "Invalid email format" on blur
- Call POST /api/validate-email to check if already registered
- Tests: valid emails, invalid formats, + alias rejection, duplicates
Score Breakdown:
- Objective clarity: 15/15 (explicit)
- Acceptance criteria: 15/15 (specific, testable)
- Context: 10/10 (file location, existing utility)
- Dependencies: 10/10 (validator.js, API endpoint)
- Tech stack: 10/10 (validator.js named)
- Test requirements: 10/10 (specific test cases)
- Performance/security: 7/10 (missing perf req)
- Scope: 10/10 (single file)
- Complexity: 5/10 (integration but clear)
Total: 92/100 → PROCEED
Example 2: Poor Spec (Score: 28/100)
Task: Improve the signup
Score Breakdown:
- Objective clarity: 5/15 (vague)
- Acceptance criteria: 0/15 (none)
- Context: 0/10 (none)
- Dependencies: 0/10 (none)
- Tech stack: 0/10 (none)
- Test requirements: 0/10 (none)
- Performance/security: 0/10 (none)
- Scope: 8/10 (assume small)
- Complexity: 5/10 (assume simple)
Ambiguity Penalties:
- "Improve" (vague verb): -10
- "signup" (subjective good?): 0
Total: 28/100 → REFUSE
Feedback:
❌ SPEC INSUFFICIENT (28/100)
Missing:
• Specific acceptance criteria
• What to improve about signup?
• Context (which signup flow?)
• Dependencies and tech stack
• Test requirements
Please provide:
1. What specific improvement is needed?
2. Acceptance criteria for "done"
3. Which signup form/page?
4. Any constraints or requirements
Actions by Score Range
80-100: PROCEED
- Well-defined spec
- Proceed with confidence calculation
- Note: "Spec score: XX/100 - well-defined"
60-79: MINOR GAPS
- Generally clear, some details missing
- Proceed but note assumptions
- Format:
⚠️ SPEC HAS MINOR GAPS (68/100) Assumptions: • Using existing test framework • Standard error handling • No special performance requirements Proceeding with these assumptions. Correct if wrong.
40-59: MAJOR GAPS
- Significant ambiguities
- Ask clarifying questions before proceeding
- Format:
❌ SPEC NEEDS CLARIFICATION (45/100) Missing Details: • Which file(s) should be modified? • What validation library to use? • Acceptance criteria not specified • No test requirements Please clarify: 1. Where should this be implemented? 2. What tech stack/libraries? 3. What defines "done"?
< 40: UNACCEPTABLE
- Too vague to execute
- Refuse task
- Request complete spec
- Format:
❌ CANNOT PROCEED - SPEC TOO UNCLEAR (28/100) This task is too vague. Please provide: 1. **Objective:** What exactly needs to be done? 2. **Acceptance Criteria:** How do we know it's done? 3. **Context:** Where/why is this needed? 4. **Dependencies:** What libraries/services? Example of a good spec: "Add email validation to /pages/auth/signup.tsx using validator.js. Must validate RFC 5322 format, reject + aliases, show inline errors. Tests for valid, invalid, and duplicate emails."
Integration Flow
- Invoke: Called by aeo-core during Phase 0
- Score: Analyze task and calculate score
- Return: Score + feedback if needed
- Decision: aeo-core uses score for confidence calculation
Examples for Reference
Show these examples if user asks for clarification:
Good Spec Template
[Task Name]
**Objective:** [Specific action to take]
**Location:** [File paths, components, modules]
**Requirements:**
- [Requirement 1]
- [Requirement 2]
**Dependencies:**
- Libraries: [list]
- Services: [list]
- APIs: [list]
**Acceptance Criteria:**
- [Criteria 1 - testable]
- [Criteria 2 - testable]
**Tests:**
- [Test case 1]
- [Test case 2]
**Constraints:**
- Performance: [requirements]
- Security: [requirements]
Common Mistakes
❌ "Make it faster" ✅ "Reduce API response time from 2s to <500ms by adding caching"
❌ "Fix the bug" ✅ "Fix null reference error in UserService.getUser() when user ID not found"
❌ "Add authentication" ✅ "Add JWT authentication to /api/* routes using bcrypt for password hashing"