eval-integration
Integration Evaluator Agent
Specialized evaluator for tracks that integrate external services — Supabase, Stripe, Gemini, or any third-party API.
When This Evaluator Is Used
Dispatched by loop-execution-evaluator when the track involves:
- Authentication or database integration
- Payment processing integration
- AI/ML API integration
- Any external API connection
Inputs Required
- Track's
spec.mdandplan.md - Environment config (
.env.example, env variable documentation) - API client code (
src/lib/) - Database schema (if Supabase)
- Webhook handlers (if Stripe)
Evaluation Passes (6 checks)
Pass 1: API Contract Verification
| Check | What to Look For |
|---|---|
| Request shapes | API calls send correct payload structure |
| Response handling | Responses parsed with correct types |
| Error responses | 4xx/5xx errors handled with user-friendly messaging |
| Rate limits | Rate limit handling present (retry, backoff, queue) |
| Timeout | Reasonable timeout set on API calls |
| Auth headers | Bearer token / API key sent correctly |
### API Contracts: PASS ✅ / FAIL ❌
- Endpoints verified: [count]
- Missing error handling: [list]
- Type mismatches: [list]
Pass 2: Authentication Flow
| Check | What to Look For |
|---|---|
| Sign up | Creates user, stores token, redirects to dashboard |
| Sign in | Validates credentials, stores token, redirects |
| Sign out | Clears token, redirects to home |
| Token refresh | Handles expired tokens (refresh or re-auth) |
| Protected routes | Unauthenticated users redirected to login |
| OAuth | Third-party login flow (if applicable) |
### Auth Flow: PASS ✅ / FAIL ❌
- Flows tested: [sign up / sign in / sign out / token refresh]
- Broken flows: [list]
- Token handling: [correct / issues]
Pass 3: Data Persistence & Schema Hygiene
CRUD Operations:
| Check | What to Look For |
|---|---|
| Create | Data saved correctly to database/storage |
| read_file | Data retrieved and rendered correctly |
| Update | Changes persisted on save |
| Delete | Records removed, UI reflects deletion |
| Relationships | Foreign keys / joins working correctly |
| Storage | File uploads stored and retrievable (if applicable) |
Database Schema Quality (MANDATORY for all new tables/migrations):
| Check | Requirement | Why |
|---|---|---|
| Timestamps | created_at, updated_at on ALL mutable tables |
Debugging, audit trail, cache invalidation |
| Primary keys | UUID with default OR auto-increment | Data uniqueness |
| Foreign keys | Explicit cascade rules (on delete cascade) |
Prevent orphaned data |
| Indexes | Index ALL foreign keys | Query performance |
| Null constraints | New columns nullable OR have defaults | Backward compatibility |
| Unique constraints | Composite uniques where needed | Data integrity |
| Version history | JSONB column for flexible history | Schema evolution |
Schema Anti-Patterns to Flag:
-- ❌ BAD: No timestamps
create table brands (
id uuid primary key,
name text
);
-- ✅ GOOD: Complete schema
create table brands (
id uuid primary key default gen_random_uuid(),
name text not null,
created_at timestamptz default now() not null,
updated_at timestamptz default now() not null
);
-- ❌ BAD: Foreign key without cascade
brand_id uuid references brands(id)
-- ✅ GOOD: Explicit cascade
brand_id uuid references brands(id) on delete cascade not null
-- ❌ BAD: New required column (breaks existing data)
alter table assets add column image_url text not null;
-- ✅ GOOD: Nullable or has default
alter table assets add column locked boolean default false;
### Data Persistence & Schema: PASS ✅ / FAIL ❌
- CRUD operations: [which work / which fail]
- Data integrity: [any corruption or loss]
- Storage: [files accessible / issues]
- **Tables missing timestamps: [count] — [list]**
- **Foreign keys without indexes: [count] — [list]**
- **Migrations without defaults: [count] — [list]**
- **Orphaned data risk: [YES/NO — describe]**
Pass 4: Error Recovery
| Check | What to Look For |
|---|---|
| Network failure | Offline/timeout → user sees error, can retry |
| Invalid data | Malformed responses → graceful fallback |
| Auth failure | Expired token → redirect to login, not crash |
| Payment failure | Declined card → clear message, can retry |
| API down | Service unavailable → error state, not blank screen |
| Partial failure | One API fails, others still work |
### Error Recovery: PASS ✅ / FAIL ❌
- Scenarios tested: [list]
- Unhandled failures: [list]
- User messaging: [clear / missing]
Pass 5: Environment Configuration
| Check | What to Look For |
|---|---|
.env.example |
All required variables documented |
| No secrets in code | No API keys, tokens, or passwords in source files |
| Environment switching | Dev/staging/prod configs separate |
| Missing vars | App handles missing env vars gracefully (error, not crash) |
### Environment: PASS ✅ / FAIL ❌
- Variables documented: [YES/NO]
- Secrets in code: [NONE / list files with exposed secrets]
- Missing var handling: [graceful / crashes]
Pass 6: End-to-End Flow
Walk through the complete user journey that involves this integration:
| Flow | Steps to Verify |
|---|---|
| Auth flow | Landing → Sign Up → Verify → Dashboard |
| Payment flow | Select plan → Checkout → Payment → Confirmation |
| Generation flow | Form → Generate → View → Download |
### E2E Flow: PASS ✅ / FAIL ❌
- Flow tested: [describe]
- Steps completed: [X]/[Y]
- Broken at step: [which step, if any]
Verdict Template
## Integration Evaluation Report
**Track**: [track-id]
**Evaluator**: eval-integration
**Date**: [YYYY-MM-DD]
**Service**: [Supabase/Stripe/Gemini/etc.]
### Results
| Pass | Status | Issues |
|------|--------|--------|
| 1. API Contracts | PASS/FAIL | [details] |
| 2. Auth Flow | PASS/FAIL | [details] |
| 3. Data Persistence | PASS/FAIL | [details] |
| 4. Error Recovery | PASS/FAIL | [details] |
| 5. Environment | PASS/FAIL | [details] |
| 6. E2E Flow | PASS/FAIL | [details] |
### Verdict: PASS ✅ / FAIL ❌
[If FAIL, list specific fix actions for loop-fixer]
Handoff
- PASS → Return to
loop-execution-evaluator→ Conductor marks complete - FAIL → Return to
loop-execution-evaluator→ Conductor dispatchesloop-fixer
More from ibrahim-3d/conductor-orchestrator-superpowers
board-of-directors
Simulate a 5-member expert board deliberation for major decisions. Use when evaluating plans, architecture choices, feature designs, or any decision requiring multi-perspective expert analysis. Triggers: 'board review', 'get expert opinions', 'board meeting', 'director evaluation', 'consensus review'.
9conductor-orchestrator
Master coordinator for the Evaluate-Loop workflow v3. Supports GOAL-DRIVEN entry, PARALLEL execution via worker agents, BOARD OF DIRECTORS deliberation, and message bus coordination. Dispatches specialized workers dynamically, monitors via message bus, aggregates results. Uses metadata.json v3 for parallel state tracking. Use when: '/go <goal>', '/conductor implement', 'start track', 'run the loop', 'orchestrate', 'automate track'.
8eval-business-logic
Specialized business logic evaluator for the Evaluate-Loop. Use this for evaluating tracks that implement core product logic — pipelines, dependency resolution, state machines, pricing/tier enforcement, packaging. Checks feature correctness against product rules, edge cases, state transitions, data flow, and user journey completeness. Dispatched by loop-execution-evaluator when track type is 'business-logic', 'generator', or 'core-feature'. Triggered by: 'evaluate logic', 'test business rules', 'verify business rules', 'check feature'.
8executing-plans
Use when you have a written implementation plan to execute in a separate session with review checkpoints
7agent-factory
Creates specialized worker agents dynamically from templates. Use when orchestrator needs to spawn task-specific workers for parallel execution. Handles agent lifecycle: create -> execute -> cleanup.
7context-loader
Load project context efficiently for Conductor workflows. Use when starting work on a track, implementing features, or needing project context without consuming excessive tokens.
7