skill-evals-run
Skill Evals Run
Run the local skill-loading eval suite with the shell runner.
Prerequisites
- opencode is installed and on PATH.
- Provider config is available under ~/.config/opencode/ (used even with --isolate-config).
- If using --disable-models-fetch, ~/.cache/opencode/models.json exists and includes the target model.
- Auth/credentials are present (typically ~/.local/share/opencode/auth.json).
- Network access is available for model calls.
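The file-level prerequisites above can be checked before a run. The following is a minimal sketch, not part of the runner: check_prereqs is a hypothetical helper that takes an optional home directory (defaulting to $HOME) and prints one line per item that looks missing. It does not verify that models.json includes the target model or that network access works.

```shell
# check_prereqs: hypothetical preflight helper (an assumption, not shipped
# with the eval suite). Prints a line for each prerequisite that looks
# missing and returns non-zero if anything was reported.
check_prereqs() {
  local home_dir="${1:-$HOME}"
  local missing=0

  # opencode must be installed and on PATH.
  if ! command -v opencode >/dev/null 2>&1; then
    echo "missing: opencode is not on PATH"
    missing=1
  fi

  # Provider config directory (used even with --isolate-config).
  if [ ! -d "$home_dir/.config/opencode" ]; then
    echo "missing: $home_dir/.config/opencode/"
    missing=1
  fi

  # Model cache, required when running with --disable-models-fetch.
  if [ ! -f "$home_dir/.cache/opencode/models.json" ]; then
    echo "missing: $home_dir/.cache/opencode/models.json"
    missing=1
  fi

  # Typical location of auth/credentials.
  if [ ! -f "$home_dir/.local/share/opencode/auth.json" ]; then
    echo "missing: $home_dir/.local/share/opencode/auth.json"
    missing=1
  fi

  return "$missing"
}
```

Calling check_prereqs with no argument checks the current user's home directory; an empty output and a zero exit status mean every checked item was found.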
Command
Run:
evals/skill-loading/opencode_skill_eval_runner.sh \
--repo "$PWD" \
--dataset evals/skill-loading/opencode_skill_loading_eval_dataset.jsonl \
--matrix evals/skill-loading/opencode_skill_eval_matrix.json \
--disable-models-fetch \
--isolate-config \
--parallel 3
Arguments
If the user provides any of the following flags, append them to the command:
- --filter-id <regex>
- --filter-category <substring>
- --parallel <n>
If --parallel is omitted, keep the default of 3.
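One way to assemble the invocation from optional user input is sketched below. print_eval_command is a hypothetical helper, not part of the runner; it takes a filter ID regex, a category substring, and a parallel count as positional arguments (all optional) and prints the resulting command one argument per line so it can be reviewed before running.

```shell
# print_eval_command: hypothetical helper (an assumption for illustration).
# Usage: print_eval_command [filter_id_regex] [filter_category] [parallel]
print_eval_command() {
  local filter_id="${1:-}"
  local filter_category="${2:-}"
  local parallel="${3:-3}"   # keep the default of 3 when --parallel is omitted

  # Base command, as given in the Command section.
  set -- evals/skill-loading/opencode_skill_eval_runner.sh \
    --repo "$PWD" \
    --dataset evals/skill-loading/opencode_skill_loading_eval_dataset.jsonl \
    --matrix evals/skill-loading/opencode_skill_eval_matrix.json \
    --disable-models-fetch \
    --isolate-config \
    --parallel "$parallel"

  # Append optional filters only when the user supplied them.
  if [ -n "$filter_id" ]; then
    set -- "$@" --filter-id "$filter_id"
  fi
  if [ -n "$filter_category" ]; then
    set -- "$@" --filter-category "$filter_category"
  fi

  printf '%s\n' "$@"
}
```

For example, print_eval_command '^git-' '' 5 would list the base arguments with --parallel 5 and a trailing --filter-id ^git-, and no --filter-category. The regex shown is only a placeholder value.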
After the run
- Summarize PASS/FAIL counts and list failed case IDs.
- If any cases FAIL (as opposed to ERROR), reference evals/skill-loading/docs/skill-optimization-steering.md and suggest the next remediation step.
- If any cases are ERROR, do not suggest optimization. Instead, inspect evals/skill-loading/.tmp/opencode-eval-results/<run>/results.json and any traces to identify the crash, then re-run the evals once the error is resolved.
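The follow-up rules above amount to a small decision: ERROR takes priority over FAIL. A minimal sketch, assuming the PASS, FAIL, and ERROR counts have already been obtained by the caller (the layout of results.json is not described here, so no parsing is attempted). next_step is a hypothetical helper, and the "no follow-up needed" message is an assumption for the all-pass case.

```shell
# next_step: hypothetical helper (an assumption for illustration).
# Usage: next_step <pass_count> <fail_count> <error_count>
next_step() {
  local pass="$1" fail="$2" error="$3"

  echo "PASS=$pass FAIL=$fail ERROR=$error"

  if [ "$error" -gt 0 ]; then
    # Errors mean the run itself broke: diagnose first, do not optimize.
    echo "inspect evals/skill-loading/.tmp/opencode-eval-results/<run>/results.json and any traces, then re-run"
  elif [ "$fail" -gt 0 ]; then
    # Genuine failures: consult the steering doc for remediation.
    echo "see evals/skill-loading/docs/skill-optimization-steering.md"
  else
    echo "no follow-up needed"
  fi
}
```

The <run> segment is left as a placeholder because the run directory name depends on the specific run.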
Notes
- Run from the repo root so relative paths resolve.
- --isolate-config also disables project config, so no extra flag is required to avoid loading repo config/plugins during evals.
- With --disable-models-fetch, the runner falls back to ~/.cache/opencode/models.json when present. Use --models-url file://... if you need a different cache file.
- Include the exact command used in the response.
Repository
chandima/opencode-config