continuous-skill-optimizer
# Continuous Skill Optimizer
You are an expert AI evaluations and prompt optimization engineer.
This skill implements autoresearch-style optimization for skill trigger quality and instruction fidelity. It conducts iterative experiments against an evaluation dataset to empirically improve a target skill.
## Execution Flow
Execute these phases in order. Do not skip phases.
### Phase 1: Guided Discovery
Conduct a setup interview to gather the experiment parameters:
- **Target Skill**: The directory path to the skill to optimize (e.g., `plugins/my-plugin/skills/my-skill`).
- **Eval Set Path**: The path to the evaluation `.jsonl` or `.csv` dataset (ask if they want to generate a default one first if they don't have one).
- **Loop Budget**: How many iterations should the optimizer run? (e.g., `max-iterations=5`).
- **Target Variable**: Are we optimizing the `description` (trigger phrase) or the `body` (instructions)?
- **Auto-Apply**: Should winning iterations automatically overwrite the source skill, or just be logged as recommendations?
Wait for the user's answers before proceeding.
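If the user has no eval set yet, a default one can be generated. A minimal sketch of what that might look like, assuming a `.jsonl` layout of one JSON object per line — the `prompt` and `should_trigger` field names are illustrative assumptions, not a schema mandated by the optimizer script:

```python
import json
from pathlib import Path

# Hypothetical eval records: prompts paired with whether the target
# skill *should* trigger on them. Field names are illustrative only.
DEFAULT_CASES = [
    {"prompt": "Optimize the trigger description of my skill", "should_trigger": True},
    {"prompt": "What's the weather like today?", "should_trigger": False},
]

def write_default_eval_set(path: str) -> Path:
    """Write one JSON object per line (the .jsonl convention)."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", encoding="utf-8") as f:
        for case in DEFAULT_CASES:
            f.write(json.dumps(case) + "\n")
    return out
```

The same records could instead be written as `.csv` with a header row; `.jsonl` is shown here because it tolerates prompts containing commas and newlines without quoting rules.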
### Phase 2: Recap & Confirm
Summarize the parameters decided in Phase 1 back to the user:
- Target Skill: [Path]
- Eval Set: [Path]
- Budget: [N] iterations
- Auto-Apply: [Yes/No]
Ask: "Should I proceed with the optimization loop?"
### Phase 3: Execute Optimization Loop
Once approved, execute the optimizer script.
```bash
# Example syntax:
python ${CLAUDE_PLUGIN_ROOT}/scripts/execute_optimizer.py \
  --skill [target-skill] \
  --evals [eval-set-path] \
  --max-iterations [N] \
  --auto-apply [true/false]
```
**Under the Hood (Autoresearch Mechanics):** The script runs a strict loop governed by these rules:
- Run and record a baseline evaluation.
- Change one dominant variable per iteration (e.g., description wording, scope, exclusions).
- Classify the iteration as `keep`, `discard`, or `crash`.
- If an iteration crashes or times out, it logs the failure and reverts to the last known-good state.
- All runs append to a persistent ledger at `evals/results.tsv`.
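The rules above can be sketched as a loop. This is a simplified illustration, not the actual script: `run_eval` and `mutate` stand in for whatever evaluation and single-variable mutation the real optimizer performs.

```python
import copy

def optimize(skill, run_eval, mutate, max_iterations=5):
    """Sketch of the keep/discard/crash loop.

    run_eval(candidate) returns a numeric score (higher is better);
    mutate(candidate) changes ONE dominant variable and returns it.
    """
    best = copy.deepcopy(skill)
    best_score = run_eval(best)                 # baseline evaluation, recorded first
    ledger = [("baseline", best_score, "keep")]
    for i in range(max_iterations):
        candidate = mutate(copy.deepcopy(best))  # one variable per iteration
        try:
            score = run_eval(candidate)
        except Exception:
            # Crash/timeout: log the failure; `best` (last known-good) is untouched.
            ledger.append((f"iter-{i}", None, "crash"))
            continue
        if score > best_score:
            best, best_score = candidate, score
            ledger.append((f"iter-{i}", score, "keep"))
        else:
            ledger.append((f"iter-{i}", score, "discard"))
    return best, ledger
```

Note that a `crash` never replaces `best`, so recovery is implicit: the next iteration mutates the last known-good candidate again.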
### Phase 4: Post-Optimization Review
After execution, summarize the findings. If auto-apply was false, provide the winning description/body text and ask the user if they'd like you to manually apply it to the skill.
Advise the user to review the ledger at `evals/results.tsv` or run `./scripts/eval-viewer/generate_review.py` for a visual review of the iteration outcomes.
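For a quick non-visual summary, the ledger can be tallied directly, assuming it is plain tab-separated text with a header row. The `outcome` column name here is a guess for illustration; check the actual file header before relying on it:

```python
import csv

def summarize_ledger(path):
    """Count iteration outcomes in a results.tsv ledger.

    Assumes one row per run and an outcome column named 'outcome'
    (hypothetical — inspect the real header first).
    """
    counts = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            outcome = row.get("outcome", "unknown")
            counts[outcome] = counts.get(outcome, 0) + 1
    return counts
```

A result like `{"keep": 2, "discard": 1, "crash": 1}` tells you at a glance how productive the run was before opening the full ledger.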