aif-loop
Loop - Reflex Iteration Workflow
Run a result-focused iterative loop with strict phase contracts, evaluation rules, and persistent state between sessions.
Terminology:
- loop = one full execution for a task alias (stored in run.json, identified by run_id)
- iteration = one cycle inside that loop
Core Idea
Each iteration executes 6 phases with parallel execution where possible:
- PLAN - short plan for current iteration
- PRODUCE - produce one artifact.md ← runs in parallel with PREPARE
- PREPARE - generate check scripts and test definitions from rules ← runs in parallel with PRODUCE
- EVALUATE - run prepared checks + content rules against artifact, score result. Uses parallel Task agents for independent check groups
- CRITIQUE - precise issues + fixes (only if fail)
- REFINE - rewrite artifact using critique (only if fail)
PLAN
│
┌──────┴──────┐
↓ ↓ ← parallel (Task tool)
PRODUCE PREPARE
(artifact) (checks)
↓ ↓
└──────┬──────┘
↓
EVALUATE ← parallel check execution (Task tool)
┌───┼───┐
↓ ↓ ↓
exec content aggregate
└───┼───┘
↓
CRITIQUE (if fail)
↓
REFINE (if fail)
Stop when quality is good enough, no major issues remain, or iteration limit is reached.
Persistence Contract
Use exactly 3+1 files for state (where current.json exists only while a loop is active):
.ai-factory/evolution/current.json
.ai-factory/evolution/<task-alias>/run.json
.ai-factory/evolution/<task-alias>/history.jsonl
.ai-factory/evolution/<task-alias>/artifact.md
Do not create extra index files or per-iteration folder trees unless user explicitly asks.
File Roles
- current.json: pointer to active loop only; delete it when loop becomes completed/stopped/failed
- run.json: single source of truth for current loop state
- history.jsonl: append-only event log (one JSON object per line)
- artifact.md: single source of truth for artifact content (written after PRODUCE and REFINE phases, never duplicated in run.json)
Command Modes
Parse $ARGUMENTS:
- status - show active loop status from current.json and stop
- resume [alias] - continue active loop or loop by alias
- stop [reason] - stop active loop with reason (user_stop if omitted)
- new <task> or no mode + task text - start new loop
- list - list all task aliases with status (running/stopped/completed/failed)
- history [alias] - show event history for a loop (default: active loop)
- clean [alias|--all] - remove loop files for a stopped/completed/failed loop (requires user confirmation, always confirm before deleting)
If no task and no active loop exists, ask user for task prompt.
Step 0: Load Context
Read these files if present:
- .ai-factory/DESCRIPTION.md
- .ai-factory/ARCHITECTURE.md
- .ai-factory/RULES.md
Use them to keep outputs aligned with project conventions.
Read .ai-factory/skill-context/aif-loop/SKILL.md — MANDATORY if the file exists.
This file contains project-specific rules accumulated by /aif-evolve from patches,
codebase conventions, and tech-stack analysis. These rules are tailored to the current project.
How to apply skill-context rules:
- Treat them as project-level overrides for this skill's general instructions
- When a skill-context rule conflicts with a general rule written in this SKILL.md, the skill-context rule wins (more specific context takes priority — same principle as nested CLAUDE.md files)
- When there is no conflict, apply both: general rules from SKILL.md + project rules from skill-context
- Do NOT ignore skill-context rules even if they seem to contradict this skill's defaults — they exist because the project's experience proved the default insufficient
- CRITICAL: skill-context rules apply to ALL outputs of this skill — including the generated artifact, run state, and evaluation criteria. If a skill-context rule says "artifact MUST include X" or "evaluation MUST check Y" — you MUST comply. Producing loop outputs that violate skill-context rules is a bug.
Enforcement: After generating any output artifact, verify it against all skill-context rules. If any rule is violated — fix the output before presenting it to the user.
Step 0.1: Handle Non-Iteration Commands
If command is status, stop, list, history, or clean, execute and stop:
- status: read current.json; if file exists, read pointed run.json and display alias | status | iteration | phase | current_step | last_score | updated_at; if file is missing, report that no loop is active
- stop [reason]: stop active running loop only; set run.json.status = "stopped" and run.json.stop.reason = <reason or "user_stop">, append stopped event to history.jsonl, then delete current.json (active pointer cleared) and exit
- list: scan .ai-factory/evolution/ directories, read each run.json, display table of alias | status | iteration | last_score | updated_at
- history [alias]: read history.jsonl for the alias (or active loop), display formatted event timeline
- clean [alias|--all]: show what will be deleted, ask for explicit user confirmation via AskUserQuestion, then delete loop directory. Only clean stopped/completed/failed loops — refuse to clean running loops. Update current.json if needed.
Step 1: Initialize or Resume Loop
1.1 Ensure directories
mkdir -p .ai-factory/evolution
1.2 Alias and IDs (new loop)
Generate:
- task_alias: lowercase hyphen slug (3-64 chars)
- run_id: <task_alias>-<yyyyMMdd-HHmmss>
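A minimal sketch of alias and run_id generation, assuming the alias is slugified from the task prompt (the padding behavior for very short prompts is an assumption, not specified above):

```python
import re
from datetime import datetime, timezone

def make_task_alias(prompt: str, max_len: int = 64) -> str:
    """Slugify the task prompt: lowercase, hyphen-separated, 3-64 chars."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    slug = slug[:max_len].rstrip("-")
    if len(slug) < 3:
        slug = (slug + "-task")[:max_len]  # assumption: pad very short prompts
    return slug

def make_run_id(task_alias: str) -> str:
    """run_id = <task_alias>-<yyyyMMdd-HHmmss>, timestamped in UTC."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return f"{task_alias}-{stamp}"
```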
1.3 Write current.json
{
"active_run_id": "courses-api-ddd-20260218-120000",
"task_alias": "courses-api-ddd",
"status": "running",
"updated_at": "2026-02-18T12:00:00Z"
}
1.4 Write initial run.json
{
"run_id": "courses-api-ddd-20260218-120000",
"task_alias": "courses-api-ddd",
"status": "running",
"iteration": 1,
"max_iterations": 4,
"phase": "A",
"current_step": "PLAN",
"task": {
"prompt": "OpenAPI 3.1 spec + DDD notes + JSON examples",
"ideal_result": "..."
},
"criteria": {
"name": "loop_default_v1",
"version": 1,
"phase": {
"A": { "threshold": 0.8, "active_levels": ["A"] },
"B": { "threshold": 0.9, "active_levels": ["A", "B"] }
},
"rules": []
},
"plan": [],
"prepared_checks": null,
"evaluation": null,
"critique": null,
"stop": { "passed": false, "reason": "" },
"last_score": 0,
"stagnation_count": 0,
"created_at": "2026-02-18T12:00:00Z",
"updated_at": "2026-02-18T12:00:00Z"
}
1.5 Resume Logic
When resuming a loop:
- Read run.json to get current_step and iteration
- Read last event from history.jsonl to confirm consistency
- If run.json.current_step indicates a phase was interrupted:
  - Re-execute from that phase (do not skip)
  - PRODUCE_PREPARE: always re-run both PRODUCE and PREPARE (idempotent — artifact overwrites, checks regenerate)
- If run.json.status is stopped, completed, or failed, inform user and suggest new (for failed runs, also show the last phase_error event from history.jsonl so user understands what went wrong)
Step 2: Interactive Setup (new loop)
Quick mode (default, confirmation-first)
If the task prompt contains enough context to infer task type and ideal result:
- Auto-detect task type from prompt (API spec, code, docs, config)
- Load matching template from references/CRITERIA-TEMPLATES.md
- Draft inferred rules, phase thresholds (fallback: A=0.8, B=0.9), and max iterations (default: 4)
- Show inferred settings as a draft summary
- Always ask explicit confirmation of success criteria (rules/thresholds) via AskUserQuestion, even if criteria were already present in the task text
- Always ask explicit confirmation of max iterations via AskUserQuestion, even if iteration count was already present in the task text
- If user changes either criteria or max iterations, update the draft and re-confirm both fields
- Start iteration 1 only after both confirmations are explicit
- If task type cannot be auto-detected (ambiguous or mixed prompt), fall through to full setup immediately
Full setup
Critical guardrail:
- Always re-ask and explicitly confirm success criteria and max iterations, even if both are already written in the task prompt.
Ask concise setup questions before first iteration:
- Task type - what kind of artifact? (API spec, code, docs, config, other) - used to load template from references/CRITERIA-TEMPLATES.md
- Ideal result definition
- Mandatory checks (tests, schema/contract, specific requirements)
- Quality threshold (A/B phases)
- Max iterations (default: 4)
- What counts as a major issue
- Explicit confirmation: "Confirm these success criteria?"
- Explicit confirmation: "Confirm max iterations = N?"
Generate evaluation rules from answers:
- Load matching template from references/CRITERIA-TEMPLATES.md as starting point
- Add task-specific rules based on ideal result and mandatory checks
- Let user review and adjust rules before starting
Persist answers and generated rules inside run.json.criteria (snapshot for reproducibility).
Never treat criteria or iteration limits parsed from task text as final until the user explicitly confirms both.
Normalization rules before persisting:
- run.json.max_iterations is the single source of truth for iteration limit
- every rule must be expanded to full RULE-SCHEMA format (id, description, severity, weight, phase, check)
- if template shorthand omitted weight, derive from severity (fail=2, warn=1, info=0)
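The normalization rules above can be sketched as follows. The defaults for the phase and check fields are illustrative assumptions; the authoritative field definitions live in references/RULE-SCHEMA.md.

```python
SEVERITY_WEIGHT = {"fail": 2, "warn": 1, "info": 0}

def normalize_rule(rule: dict) -> dict:
    """Expand a template-shorthand rule toward full RULE-SCHEMA form.
    A missing weight is derived from severity (fail=2, warn=1, info=0)."""
    full = dict(rule)
    full.setdefault("phase", "A")                   # assumption: A-level by default
    full.setdefault("check", {"type": "content"})   # hypothetical default check shape
    if "weight" not in full:
        full["weight"] = SEVERITY_WEIGHT[full["severity"]]
    return full
```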
Step 3: Phase Contracts
Before running phases, load:
- references/PHASE-CONTRACTS.md - strict I/O contracts for each phase
- references/RULE-SCHEMA.md - rule format and score calculation
3.1 Phases
- PLAN - generates iteration plan (sequential)
- PRODUCE - generates artifact (parallel with PREPARE)
- PREPARE - generates check scripts/definitions from rules + task prompt (parallel with PRODUCE)
- EVALUATE - runs prepared checks + content rules, aggregates score (parallel check groups via Task)
- CRITIQUE - identifies issues with fix instructions (sequential, only on fail)
- REFINE - applies fixes to artifact (sequential, only on fail)
3.2 Parallel Execution Model
Two levels of parallelism via Task tool:
- Inter-phase: PRODUCE and PREPARE run as parallel Task agents after PLAN completes. Both depend only on PLAN output.
- Intra-phase: EVALUATE spawns parallel Task agents for independent check groups (executable checks via Bash, content rules via Read/Grep). Aggregates results into final score.
3.3 Phase Output Format
Each phase produces its defined output (see PHASE-CONTRACTS.md). No envelope wrapping. No router output.
Step 4: Iteration Execution
For each iteration:
1. Set run.json.current_step = "PLAN", run PLAN phase
2. Set run.json.current_step = "PRODUCE_PREPARE", launch both as parallel Task agents:
   - Task A (PRODUCE): generates artifact → writes to artifact.md
   - Task B (PREPARE): generates check scripts/definitions from rules + plan
   - Wait for both to complete
3. Set run.json.current_step = "EVALUATE", run EVALUATE phase:
   - Spawn parallel Task agents for independent check groups:
     - Executable checks (compile, lint, tests) → Task with Bash
     - Content rules (structure, completeness, style) → Task with Read/Grep
   - Aggregate results into score
4. If passed=false:
   - Set run.json.current_step = "CRITIQUE", run CRITIQUE phase
   - Set run.json.current_step = "REFINE", run REFINE phase
   - Write updated artifact to artifact.md
   - Increment iteration and continue
5. If phase=A and passed=true:
   - Switch to phase=B, activate B-level rules
   - Set run.json.current_step = "PREPARE", re-run PREPARE with phase=B to materialize B-level checks (no PLAN/PRODUCE — artifact already passed A)
   - Set run.json.current_step = "EVALUATE", run EVALUATE against the same artifact with B-level prepared checks
   - If B evaluation also passes → stop with success (threshold_reached)
   - If B evaluation fails → continue to CRITIQUE → REFINE, then increment iteration
6. If phase=B and passed=true:
   - Stop with success (threshold_reached)
Fallback to Sequential
If Task tool is unavailable or returns errors, fall back to sequential execution: PLAN → PRODUCE → PREPARE → EVALUATE → CRITIQUE → REFINE. The loop must work without parallelism.
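The sequential fallback can be sketched as a plain control loop. This is illustrative only: the run-state keys mirror run.json, and the `phases` mapping of phase name to callable is an assumption for the example, not a fixed API.

```python
def run_loop_sequential(run: dict, phases: dict) -> dict:
    """One iteration per pass: PLAN → PRODUCE → PREPARE → EVALUATE,
    then CRITIQUE → REFINE on fail. Phase A pass triggers a B-level
    re-PREPARE + re-EVALUATE against the same artifact."""
    while run["iteration"] <= run["max_iterations"]:
        for step in ("PLAN", "PRODUCE", "PREPARE", "EVALUATE"):
            run["current_step"] = step
            run = phases[step](run)
        if run["evaluation"]["passed"] and run["phase"] == "A":
            run["phase"] = "B"  # artifact already passed A: skip PLAN/PRODUCE
            for step in ("PREPARE", "EVALUATE"):
                run["current_step"] = step
                run = phases[step](run)
        if run["evaluation"]["passed"]:
            run.update(status="completed",
                       stop={"passed": True, "reason": "threshold_reached"})
            return run
        for step in ("CRITIQUE", "REFINE"):
            run["current_step"] = step
            run = phases[step](run)
        run["iteration"] += 1
    run.update(status="stopped",
               stop={"passed": False, "reason": "iteration_limit"})
    return run
```

The sketch deliberately omits persistence writes and stagnation checks to keep the control flow visible.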
Step 5: Stop Conditions
Stop when any condition is met:
- phase=B and passed=true (reason=threshold_reached)
- no fail-severity rules failed in current evaluation (reason=no_major_issues) — even if score is below threshold, the artifact has no blocking issues and only warn/info remain
- iteration >= run.max_iterations (reason=iteration_limit)
- explicit user stop (reason=user_stop)
- stagnation detected (reason=stagnation)
Stagnation rule
Track score progress:
- delta = score - last_score
- if delta < 0.02 and there are no severity fail blockers, increment stagnation_count
- if stagnation_count >= 2, stop with stagnation
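A sketch of that rule as a single update function. Resetting the counter when progress resumes is an assumption (the rule above does not specify it); the function name and signature are illustrative.

```python
def update_stagnation(run: dict, score: float, has_fail_blockers: bool) -> bool:
    """Apply the stagnation rule after an evaluation.
    Returns True when the loop should stop with reason=stagnation."""
    delta = score - run["last_score"]
    if delta < 0.02 and not has_fail_blockers:
        run["stagnation_count"] += 1
    else:
        run["stagnation_count"] = 0  # assumption: meaningful progress resets the counter
    run["last_score"] = score
    return run["stagnation_count"] >= 2
```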
Step 6: Persistence Writes (every step)
After each phase output:
- Update run.json (including current_step)
- Append event to history.jsonl
- Update current.json.updated_at
- Write artifact.md to disk after PRODUCE and REFINE phases
- Before REFINE overwrites artifact.md, save a SHA-256 hash of the previous artifact in the refinement_done event payload as "previous_artifact_hash" (enables integrity verification without bloating history)
Event names:
run_started, plan_created, artifact_created, checks_prepared, evaluation_done, critique_done, refinement_done, phase_switched, iteration_advanced, phase_error, stopped, failed
history.jsonl example line:
{"ts":"2026-02-18T12:01:10Z","run_id":"courses-api-ddd-20260218-120000","iteration":1,"phase":"A","step":"EVALUATE","event":"evaluation_done","status":"ok","payload":{"score":0.72,"passed":false}}
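Appending an event line and hashing the artifact before REFINE can be sketched as below; the helper names are illustrative, while the event fields follow the example line above.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def append_event(history_path: Path, run_id: str, iteration: int,
                 phase: str, step: str, event: str, payload: dict) -> None:
    """Append one JSON object per line to history.jsonl (append-only)."""
    line = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "run_id": run_id, "iteration": iteration, "phase": phase,
        "step": step, "event": event, "status": "ok", "payload": payload,
    }
    with history_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(line) + "\n")

def artifact_hash(artifact_path: Path) -> str:
    """SHA-256 of the artifact, recorded as previous_artifact_hash before REFINE."""
    return hashlib.sha256(artifact_path.read_bytes()).hexdigest()
```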
Step 7: Post-Loop
After the loop stops (any reason):
- Display final state summary (iteration, max_iterations, phase, final score, stop reason)
- If stop reason = iteration_limit and latest evaluation has passed=false, include mandatory distance-to-success details:
  - active phase threshold and final score
  - numeric gap to threshold (threshold - score, floor at 0)
  - remaining failed fail-severity rule count + blocking rule IDs
  - rules progress (passed_rules / total_rules)
- Ask user where to save the final artifact (default: keep in .ai-factory/evolution/<alias>/artifact.md)
- Offer to copy artifact to a user-specified path
- Suggest next skills based on artifact type:
  - API spec -> /aif-plan to implement it
  - Code -> /aif-verify to check it
  - Docs -> /aif-docs to integrate it
- Update run.json.status based on stop reason, and if current.json points to this loop, delete current.json (no active loop remains):
| Stop reason | Status |
|---|---|
| threshold_reached | completed |
| no_major_issues | completed |
| user_stop | stopped |
| iteration_limit | stopped |
| stagnation | stopped |
| phase_error | failed |
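The distance-to-success details can be computed as in the sketch below. The per-rule result shape (`rule_id`, `severity`, `passed`) is an assumption for the example; the real shape comes from the EVALUATE contract in references/PHASE-CONTRACTS.md.

```python
def distance_to_success(run: dict) -> dict:
    """Build the mandatory distance-to-success block shown when the loop
    stops with reason=iteration_limit and passed=false."""
    phase = run["phase"]
    threshold = run["criteria"]["phase"][phase]["threshold"]
    score = run["last_score"]
    results = run["evaluation"]["rule_results"]  # assumed shape, see lead-in
    blocking = [r["rule_id"] for r in results
                if r["severity"] == "fail" and not r["passed"]]
    return {
        "threshold": threshold,
        "score": score,
        "gap": max(threshold - score, 0),  # floor at 0
        "blocking_rule_ids": blocking,
        "rules_progress": f"{sum(r['passed'] for r in results)}/{len(results)}",
    }
```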
Step 8: Response Format to User
Show a compact summary after each iteration — do NOT dump full run.json or artifact.md content into the conversation. The artifact is already on disk; duplicating it wastes context.
Iteration summary format
── Iteration {N}/{max} | Phase {A|B} | Score: {score} | {PASS|FAIL} ──
Plan: {1-line summary of plan focus}
Hash: {first 8 chars of artifact SHA-256}
Changed: {list of added/modified sections, or "initial generation"}
Failed: {comma-separated rule IDs, or "none"}
Warnings: {comma-separated rule IDs, or "none"}
Artifact: .ai-factory/evolution/<alias>/artifact.md
- Hash — lets the user verify which version they're looking at without reading the full artifact
- Changed — shows what actually moved between iterations so regressions are visible from the summary alone
If passed=false, append a compact critique summary (rule ID + 1-line fix instruction per issue). Do not repeat the full artifact or full evaluation object.
When the loop terminates with reason=iteration_limit and passed=false, append a compact distance_to_success block to the final response.
Full output exceptions
Show the full artifact content (not just summary) in these cases only:
- Loop termination — the final iteration always outputs the complete artifact
- Phase A → B transition — show the phase-A-passing artifact in full once at the transition boundary for visibility (B-level evaluation still runs immediately per Step 4)
- Explicit user request — user asks to see the full artifact mid-loop
Step 9: Context Management
The loop generates significant context per iteration (subagent results, evaluation data, critique). After several iterations the conversation context grows large, degrading LLM quality.
All loop state is persisted to disk — clearing context loses nothing. The resume command fully reconstructs state from files.
When to recommend context clear
Recommend clearing context to the user in these situations:
- After iteration 2 — the midpoint of a default 4-iteration loop
- On Phase A → B transition — natural boundary, new evaluation scope begins
- After any iteration where iteration >= 3 — context is already heavy
How to recommend
After the iteration summary, append:
💡 Context is growing. Recommended: /clear then /aif-loop resume
All state is saved on disk — nothing will be lost.
Do not force or auto-clear. The user decides. If the user ignores the recommendation, continue normally.
Error Recovery
Invalid phase output
If a phase produces output that does not match its contract:
- Log the error to history.jsonl with event phase_error
- Retry the phase once with the same inputs
- If retry also fails, stop the loop with reason=phase_error and display the error
Corrupted run.json
If run.json is missing or unparseable:
- Read history.jsonl to reconstruct the last known state
- Rebuild run.json from the most recent events (last iteration, phase, score, etc.)
- If history.jsonl is also missing/empty, inform user and suggest starting a new loop
Important Rules
- run.json is the only source of current state truth (does NOT store artifact content)
- artifact.md on disk is the single source of truth for artifact content — never duplicate it in run.json
- history.jsonl is append-only; do not edit old events
- Keep loop fast: short plans, targeted critique, minimal rewrites
- Do not create extra files beyond the 3+1 persistence files
- Evaluator must remain strict and non-creative
- Refiner changes only what is needed to pass failed rules
- Start simple and add complexity only when metrics show need
- Retry failed phases exactly once before stopping
- Use compact iteration summaries by default (Step 8). Full artifact output is allowed only in Step 8 exceptions; never dump full run.json into conversation.
- Recommend context clear at strategic points (Step 9) — after iteration 2, on phase transition, or when iteration >= 3
Examples
/aif-loop new OpenAPI 3.1 spec + DDD notes + JSON examples
/aif-loop resume
/aif-loop resume courses-api-ddd
/aif-loop status
/aif-loop stop
/aif-loop list
/aif-loop history
/aif-loop history courses-api-ddd
/aif-loop clean courses-api-ddd
/aif-loop clean --all