improvement-orchestrator
Improvement Orchestrator
Coordinates the full improvement pipeline: Generator → Discriminator → Evaluator → Executor → Gate.
When to Use
- Run a full improvement cycle on one or more skills
- Coordinate the 5-stage pipeline end-to-end (with optional evaluator)
- Retry failed improvements with trace-aware feedback (Ralph Wiggum loop)
When NOT to Use
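- Only need to check a skill's quality score → use improvement-learner
- Only need to score candidates manually → use improvement-discriminator
- Only need to modify a single file → use improvement-executor
- Only need to look up benchmark data → use benchmark-store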
Pipeline
propose → discriminate → evaluate* → execute → gate
↻ Ralph Wiggum: fail → inject trace → retry (max 3)
* evaluate is optional — skipped if no task_suite.yaml exists
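The control flow above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/orchestrate.py: the `Stage` signature, `run_cycle`, and the result keys are all hypothetical, and the sketch assumes "max 3" means three attempts in total.

```python
from pathlib import Path
from typing import Callable

# Hypothetical stage signature: takes the running context, returns (ok, detail).
Stage = Callable[[dict], tuple[bool, str]]


def run_cycle(stages: dict[str, Stage], skill_dir: Path, max_retries: int = 3) -> dict:
    """Run propose -> discriminate -> evaluate* -> execute -> gate with retries."""
    order = ["propose", "discriminate", "evaluate", "execute", "gate"]
    # evaluate is optional: skip it when the skill has no task_suite.yaml
    if not (skill_dir / "task_suite.yaml").exists():
        order.remove("evaluate")

    traces: list[str] = []
    # Assumption: the limit counts total attempts, not retries after the first run.
    for attempt in range(1, max_retries + 1):
        # Earlier failure traces are injected so the next attempt can react to them.
        context = {"target": str(skill_dir), "failure_traces": list(traces)}
        failure = None
        for name in order:
            ok, detail = stages[name](context)
            if not ok:
                failure = f"{name}: {detail}"
                break
        if failure is None:
            return {"status": "accepted", "attempts": attempt,
                    "stages_run": order, "traces": traces}
        traces.append(failure)

    return {"status": "rejected", "attempts": max_retries,
            "stages_run": order, "traces": traces}
```

Passing the accumulated traces into every new attempt is what makes the retry trace-aware: a stage can inspect `context["failure_traces"]` instead of repeating the same proposal blindly.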
CLI
python3 scripts/orchestrate.py \
--target /path/to/skill \
--state-root /path/to/state \
--max-retries 3 \
--auto
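To run a cycle on more than one skill, one option is a thin wrapper that builds the same command per target. The sketch below uses only the flags shown above; `build_command` and `run_batch` are hypothetical helpers, and it assumes the script takes a single `--target` per invocation.

```python
import subprocess


def build_command(target: str, state_root: str, max_retries: int = 3,
                  auto: bool = True) -> list[str]:
    """Assemble the orchestrate.py invocation for a single skill directory."""
    cmd = [
        "python3", "scripts/orchestrate.py",
        "--target", target,
        "--state-root", state_root,
        "--max-retries", str(max_retries),
    ]
    if auto:
        cmd.append("--auto")
    return cmd


def run_batch(targets: list[str], state_root: str, dry_run: bool = True) -> list[list[str]]:
    """Build one command per target; execute them only when dry_run is False."""
    commands = [build_command(t, state_root) for t in targets]
    if not dry_run:
        for cmd in commands:
            # check=False: one failing skill should not abort the rest of the batch
            subprocess.run(cmd, check=False)
    return commands
```

With `dry_run=True` the wrapper only returns the argument lists, which makes it easy to review what would run before touching any skill.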
Output Artifacts
| Request | Deliverable |
|---|---|
| Full pipeline | JSON with all stage outputs, final scores, execution trace |
| Retry cycle | Updated candidates with injected failure traces |
Related Skills
- improvement-generator: Produces candidate proposals (stage 1)
- improvement-discriminator: Multi-reviewer panel scoring (stage 2)
- improvement-evaluator: Task suite execution validation (stage 3, optional)
- improvement-executor: Applies changes with backup/rollback (stage 4)
- improvement-gate: 6-layer quality gate (stage 5)
- benchmark-store: Frozen benchmarks and Pareto front data
References
- Architecture — System design and data flow
- Guardrails — Safety rules and protected targets
- End-to-End Demo — Complete walkthrough