skills/cklxx/elephant.ai/eval-systematic-optimization

eval-systematic-optimization

Run baseline evaluation and failure clustering for foundation-suite.

Requirements

  • Go toolchain available (go in PATH).
  • Repo root as working directory (or pass cwd).

Constraints

  • Baseline command timeout: 600s.
  • Default baseline output path: /tmp/foundation-suite-<tag>-baseline.
  • The analyze action requires result_file to point to a valid JSON result file.
  • Focus is conflict-family optimization, not single-case overfitting.
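
The path and JSON constraints above can be checked before invoking the skill. The following is a minimal sketch, not part of the skill itself: the helper names default_baseline_dir and is_valid_json_file are hypothetical, and the only facts taken from this document are the default output path pattern and the requirement that analyze receives a valid JSON file.

```python
import json


def default_baseline_dir(tag: str) -> str:
    """Return the default baseline output path documented for a given tag.

    Hypothetical helper: it only mirrors the pattern
    /tmp/foundation-suite-<tag>-baseline listed under Constraints.
    """
    return f"/tmp/foundation-suite-{tag}-baseline"


def is_valid_json_file(path: str) -> bool:
    """Return True if the file exists, is readable, and parses as JSON.

    Hypothetical pre-check for the analyze action; it does not inspect
    the schema of the result file, which this document does not describe.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)
    except (OSError, ValueError):
        # OSError covers missing or unreadable files; ValueError covers
        # json.JSONDecodeError and text decoding failures.
        return False
    return True
```

Running such a check first may save an analyze call that would otherwise fail on a missing or truncated result file.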

Usage

# Run baseline
python3 skills/eval-systematic-optimization/run.py '{"action":"baseline","tag":"r12"}'

# Analyze failures
python3 skills/eval-systematic-optimization/run.py '{"action":"analyze","result_file":"/tmp/foundation-suite-r12-baseline/foundation_suite_cases.json"}'
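
When the commands above are generated from a script, hand-written JSON inside shell quotes is easy to get wrong. The sketch below builds the same argument list with json.dumps. The build_command helper is hypothetical and not provided by the skill; the script path and the action, tag, and result_file keys are the ones shown in the usage lines above.

```python
import json

# Script path as shown in the usage commands, relative to the repo root.
RUN_SCRIPT = "skills/eval-systematic-optimization/run.py"


def build_command(action: str, **params: str) -> list[str]:
    """Build an argv list for run.py with a single JSON argument.

    Hypothetical helper: only the keys that appear in the usage examples
    (action, tag, result_file) are known to be accepted.
    """
    payload = json.dumps({"action": action, **params}, separators=(",", ":"))
    return ["python3", RUN_SCRIPT, payload]


baseline_cmd = build_command("baseline", tag="r12")
analyze_cmd = build_command(
    "analyze",
    result_file="/tmp/foundation-suite-r12-baseline/foundation_suite_cases.json",
)
```

Either list can be passed to subprocess.run from the repo root, for example subprocess.run(baseline_cmd, check=True). Passing a list avoids shell quoting of the JSON argument altogether.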