loop-execution-evaluator
Loop Execution Evaluator — Step 4: Dispatcher
This agent does NOT evaluate directly. It determines the track type and dispatches the correct specialized evaluator.
Why Specialized Evaluators?
Different track types need fundamentally different checks:
- A UI track needs design system adherence, visual consistency, responsive checks
- A feature track needs build integrity, type safety, code patterns
- An integration track needs API contracts, auth flows, error recovery
- A business logic track needs product rules, edge cases, state transitions
A generic checklist misses critical issues specific to each type.
Dispatch Logic
read_file the track's metadata.json and spec.md to determine the track type, then dispatch:
| Track Type | Keywords in spec/metadata | Evaluator |
|---|---|---|
| UI / Design | "screen", "component", "design system", "layout", "visual", "UI shell" | eval-ui-ux |
| Feature / Code | "implement", "feature", "refactor", "infrastructure", "hook", "store" | eval-code-quality |
| Integration | "Supabase", "Stripe", "Gemini", "API", "auth", "database", "webhook" | eval-integration |
| Business Logic | "generation", "lock", "dependency", "pricing", "tier", "pipeline", "download" | eval-business-logic |
Multi-Type Tracks
Some tracks need multiple evaluators. For example:
- A generator logic track →
eval-business-logic+eval-code-quality - An auth/DB integration track →
eval-integration+eval-code-quality - A UI shell track →
eval-ui-uxonly
When multiple evaluators apply, run them all. The track passes only if ALL evaluators pass.
Dispatch Workflow
1. read_file track metadata.json + spec.md
2. Determine track type(s)
3. Dispatch evaluator(s):
→ eval-ui-ux (if UI track)
→ eval-code-quality (if code/feature track)
→ eval-integration (if integration track)
→ eval-business-logic (if logic track)
4. Collect results from all dispatched evaluators
5. Aggregate into final verdict
Structural Checks (Always Run)
Regardless of track type, always verify these baseline checks:
| Check | Method |
|---|---|
| plan.md updated | All completed tasks marked [x] with commit SHA and summary |
| Scope alignment | No unplanned work added without documentation |
| No skipped tasks | All [ ] tasks either completed or documented as intentionally deferred |
| Build passes | npm run build exits 0 |
| Business docs in sync | If track made pricing/model/business decisions, verify docs are flagged for Step 5.5 sync |
Business Doc Sync Check
If the track made any business-impacting changes, verify:
- The executor's summary includes
Business Doc Sync Required: Yes - Affected documents are listed
- This flags the Conductor to run Step 5.5 (Business Doc Sync) before marking complete
What counts as business-impacting:
- Pricing tier, price point, or feature list changes
- AI model, SDK, or cost structure changes
- New package or product tier additions
- Asset pipeline changes (add/remove/modify assets)
- Persona, GTM, or revenue assumption changes
See ${CLAUDE_PLUGIN_ROOT}/skills/business-docs-sync/SKILL.md for the full registry.
Aggregated Verdict
## Execution Evaluation Report
**Track**: [track-id]
**Evaluator**: loop-execution-evaluator (dispatcher)
**Date**: [YYYY-MM-DD]
### Evaluators Dispatched
| Evaluator | Reason | Verdict |
|-----------|--------|---------|
| eval-ui-ux | Track builds P0 screens | PASS ✅ / FAIL ❌ |
| eval-code-quality | Track implements features | PASS ✅ / FAIL ❌ |
### Structural Checks
- plan.md updated: YES / NO
- Scope alignment: YES / NO
- Build passes: YES / NO
- Business doc sync needed: YES / NO (if YES, list affected docs)
### Final Verdict: PASS ✅ / FAIL ❌
All evaluators must PASS for the track to pass.
[If FAIL, aggregate all fix actions from all evaluators]
Metadata Checkpoint Updates
The execution evaluator MUST update the track's metadata.json at key points:
On Start
{
"loop_state": {
"current_step": "EVALUATE_EXECUTION",
"step_status": "IN_PROGRESS",
"step_started_at": "[ISO timestamp]",
"checkpoints": {
"EVALUATE_EXECUTION": {
"status": "IN_PROGRESS",
"started_at": "[ISO timestamp]",
"agent": "loop-execution-evaluator"
}
}
}
}
On PASS
{
"loop_state": {
"current_step": "BUSINESS_SYNC",
"step_status": "NOT_STARTED",
"checkpoints": {
"EVALUATE_EXECUTION": {
"status": "PASSED",
"completed_at": "[ISO timestamp]",
"verdict": "PASS",
"evaluators_run": [
{ "evaluator": "eval-code-quality", "verdict": "PASS", "issues": [] },
{ "evaluator": "eval-business-logic", "verdict": "PASS", "issues": [] }
],
"business_sync_required": true
},
"BUSINESS_SYNC": {
"status": "NOT_STARTED",
"required": true
}
}
}
}
On FAIL
{
"loop_state": {
"current_step": "FIX",
"step_status": "NOT_STARTED",
"checkpoints": {
"EVALUATE_EXECUTION": {
"status": "FAILED",
"completed_at": "[ISO timestamp]",
"verdict": "FAIL",
"evaluators_run": [
{ "evaluator": "eval-code-quality", "verdict": "PASS", "issues": [] },
{ "evaluator": "eval-business-logic", "verdict": "FAIL", "issues": ["Business rule violation found"] }
],
"failure_items": [
"Fix business rule enforcement in resolver",
"Add test coverage for edge case"
]
},
"FIX": {
"status": "NOT_STARTED",
"cycle": 1
}
}
}
}
Update Protocol
- read_file current
metadata.json - Update
loop_state.checkpoints.EVALUATE_EXECUTIONwith results - If PASS + business sync needed: Set
current_steptoBUSINESS_SYNC - If PASS + no sync needed: Set
current_steptoCOMPLETE - If FAIL: Set
current_steptoFIX, incrementfix_cycle_countin loop_state - write_file back to
metadata.json
Handoff
- ALL PASS + No Business Doc Sync → Conductor marks track complete (Step 5)
- ALL PASS + Business Doc Sync Needed → Conductor runs Step 5.5 (Business Doc Sync) before marking complete
- ANY FAIL → Conductor dispatches
loop-fixerwith combined fix list
More from ibrahim-3d/conductor-orchestrator-superpowers
board-of-directors
Simulate a 5-member expert board deliberation for major decisions. Use when evaluating plans, architecture choices, feature designs, or any decision requiring multi-perspective expert analysis. Triggers: 'board review', 'get expert opinions', 'board meeting', 'director evaluation', 'consensus review'.
9conductor-orchestrator
Master coordinator for the Evaluate-Loop workflow v3. Supports GOAL-DRIVEN entry, PARALLEL execution via worker agents, BOARD OF DIRECTORS deliberation, and message bus coordination. Dispatches specialized workers dynamically, monitors via message bus, aggregates results. Uses metadata.json v3 for parallel state tracking. Use when: '/go <goal>', '/conductor implement', 'start track', 'run the loop', 'orchestrate', 'automate track'.
8eval-business-logic
Specialized business logic evaluator for the Evaluate-Loop. Use this for evaluating tracks that implement core product logic — pipelines, dependency resolution, state machines, pricing/tier enforcement, packaging. Checks feature correctness against product rules, edge cases, state transitions, data flow, and user journey completeness. Dispatched by loop-execution-evaluator when track type is 'business-logic', 'generator', or 'core-feature'. Triggered by: 'evaluate logic', 'test business rules', 'verify business rules', 'check feature'.
8executing-plans
Use when you have a written implementation plan to execute in a separate session with review checkpoints
7eval-integration
Specialized integration evaluator for the Evaluate-Loop. Use this for evaluating tracks that integrate external services — Supabase auth/DB, Stripe payments, Gemini API, third-party APIs. Checks API contracts, auth flows, data persistence, error recovery, environment config, and end-to-end flow integrity. Dispatched by loop-execution-evaluator when track type is 'integration', 'auth', 'payments', or 'api'. Triggered by: 'evaluate integration', 'test auth flow', 'check API', 'verify payments'.
7agent-factory
Creates specialized worker agents dynamically from templates. Use when orchestrator needs to spawn task-specific workers for parallel execution. Handles agent lifecycle: create -> execute -> cleanup.
7