experiment
SKILL.md
Experiment Assistant
Help the user scaffold and organize ML experiments.
When Brainstorming / Planning an Experiment
Before jumping to implementation, think critically:
- Challenge the hypothesis — Is this experiment the simplest way to test the claim? Is there a cheaper/faster experiment that would be equally informative?
- Apply Occam's razor — If a simpler setup would answer the same question, suggest it. Don't over-engineer experiments.
- Identify confounding variables — What else could explain the results? Are we controlling for the right things (seed, data order, hyperparams, hardware)?
- Question the metrics — Are we measuring what we think we're measuring? Could the metric be gamed or misleading?
- Consider baselines — Is the baseline fair? Are we comparing apples to apples?
- Push back when warranted — If the proposed experiment won't convincingly support or refute the hypothesis, say so and suggest alternatives.
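One way to act on the confounding-variable and baseline points above is to run the baseline and the variant under the same seeds and look at the paired differences. The following is a minimal sketch, not a prescribed method: `paired_seed_comparison`, `toy_run`, and the toy metric values are all hypothetical stand-ins for a real training/eval run.

```python
import random
import statistics
from typing import Callable, Sequence


def paired_seed_comparison(
    run_fn: Callable[[str, int], float],
    seeds: Sequence[int],
    baseline: str = "baseline",
    variant: str = "variant",
) -> dict:
    """Run both arms under identical seeds and summarize the paired differences."""
    diffs = [run_fn(variant, s) - run_fn(baseline, s) for s in seeds]
    return {
        "n_seeds": len(diffs),
        "mean_diff": statistics.mean(diffs),
        # Spread across seeds; only defined with at least two seeds.
        "stdev_diff": statistics.stdev(diffs) if len(diffs) > 1 else float("nan"),
        "wins": sum(d > 0 for d in diffs),
    }


def toy_run(arm: str, seed: int) -> float:
    """Hypothetical stand-in for a real run; returns a fake accuracy."""
    rng = random.Random(seed)  # same seed gives the same noise to both arms
    noise = rng.gauss(0.0, 0.01)
    return (0.72 if arm == "variant" else 0.70) + noise


summary = paired_seed_comparison(toy_run, seeds=[0, 1, 2, 3, 4])
print(summary)
```

If the mean difference is small relative to the spread across seeds, the experiment probably does not yet support the claim, and that is worth saying before scaling it up.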
When Setting Up a New Experiment
- Clarify the goal — what is being tested, what is the baseline, what metrics matter?
- Check the existing setup — read the repo's config system, experiment tracking, and script conventions before creating anything new
- Scaffold minimally — create only what's needed:
  - Training/eval script (or modify existing)
  - SLURM submission script in `scripts/`
  - Config changes if using Hydra/YAML
- Set up logging — W&B, TensorBoard, or whatever the repo uses. Include run name, key hyperparams, and git commit hash
- Add sanity checks — small batch forward pass, shape verification, gradient flow check before launching full runs
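The logging step above asks for a run name, key hyperparams, and the git commit hash. A minimal standard-library sketch of collecting those fields is shown here; the function names and the `run_metadata.json` filename are assumptions for illustration, and the resulting dict would be handed to whatever tracker the repo already uses.

```python
import json
import subprocess
import sys
from pathlib import Path


def git_commit_hash(repo_dir: str = ".") -> str:
    """Return the current commit hash, or 'unknown' if git is unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (subprocess.CalledProcessError, OSError):
        return "unknown"


def build_run_metadata(run_name: str, hparams: dict, seed: int) -> dict:
    """Collect the fields needed to identify and reproduce a run."""
    return {
        "run_name": run_name,
        "hparams": hparams,
        "seed": seed,
        "git_commit": git_commit_hash(),
        "command": " ".join(sys.argv),
    }


def save_run_metadata(meta: dict, out_dir: str) -> Path:
    """Write the metadata next to the run's other outputs (hypothetical filename)."""
    path = Path(out_dir) / "run_metadata.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(meta, indent=2, sort_keys=True))
    return path


meta = build_run_metadata("debug_run", {"lr": 1e-4, "batch_size": 8}, seed=0)
print(json.dumps(meta, indent=2, sort_keys=True))
```

Falling back to `"unknown"` rather than raising keeps a missing git binary from blocking a debug run, though for full runs it may be preferable to fail loudly.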
Experiment Hygiene
- Name runs descriptively — encode key hyperparams in the run name (e.g. `qwq32b_math500_softmax_k15_cs01`)
- Log everything needed to reproduce — full config, git hash, command used, random seed
- Save checkpoints to a path with the run name — avoid overwriting previous experiments
- Separate stdout and stderr — use `--output` and `--error` in SLURM scripts
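The hygiene points above can be sketched as three small helpers: one that builds a run name, one that refuses to reuse a checkpoint directory, and one that emits `#SBATCH` lines with separate stdout and stderr files. All helper names are hypothetical, and reading `k15` and `cs01` in the example name as `k=15` and `cs=0.1` is an assumption made only for illustration.

```python
from pathlib import Path


def make_run_name(*tags: str, **hparams) -> str:
    """Join free-form tags and key=value hyperparams into one run name."""
    parts = list(tags)
    for key, value in hparams.items():
        # Drop dots so a value like 0.1 becomes "01" and stays filename-safe.
        parts.append(f"{key}{value}".replace(".", ""))
    return "_".join(parts)


def checkpoint_dir(root: str, run_name: str) -> Path:
    """Return a per-run checkpoint path, refusing to reuse an existing one."""
    path = Path(root) / run_name
    if path.exists():
        raise FileExistsError(f"{path} already exists; pick a new run name")
    return path


def slurm_header(run_name: str, log_dir: str = "logs") -> str:
    """Build #SBATCH lines that send stdout and stderr to separate files."""
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={run_name}",
        # %j is expanded by SLURM to the job ID.
        f"#SBATCH --output={log_dir}/{run_name}_%j.out",
        f"#SBATCH --error={log_dir}/{run_name}_%j.err",
    ])


name = make_run_name("qwq32b", "math500", "softmax", k=15, cs=0.1)
print(name)  # qwq32b_math500_softmax_k15_cs01
print(slurm_header(name))
```

Raising on an existing checkpoint directory is a deliberately strict choice; a repo that resumes runs from checkpoints would need a different policy.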
Before Launching
- Always test on a small instance first — 1 problem, short generation, small batch
- Verify data paths exist and are accessible from compute nodes
- Check GPU availability with `savail`
- Get explicit user sign-off before `sbatch`
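The path check and the sign-off gate above could be combined into one guard in front of `sbatch`. This is a sketch under stated assumptions: `missing_paths` and `submit_with_signoff` are hypothetical helpers, `scripts/train.sh` is a placeholder script path, and a path that exists where this code runs is not proof that it is visible from compute nodes.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List


def missing_paths(paths: Iterable[str]) -> List[str]:
    """Return the subset of paths that do not exist on this machine."""
    return [p for p in paths if not Path(p).exists()]


def submit_with_signoff(
    script: str,
    data_paths: Iterable[str],
    ask: Callable[[str], str] = input,
    run: Callable[..., object] = subprocess.run,
) -> bool:
    """Call sbatch on `script` only if all paths exist and the user says yes."""
    missing = missing_paths(data_paths)
    if missing:
        print(f"Refusing to submit; missing paths: {missing}")
        return False
    answer = ask(f"Submit {script} with sbatch? [y/N] ")
    if answer.strip().lower() not in {"y", "yes"}:
        print("Not submitted.")
        return False
    run(["sbatch", script], check=True)
    return True


# Dry run with stand-ins for the prompt and the subprocess call,
# so nothing is actually submitted here.
calls = []
submitted = submit_with_signoff(
    "scripts/train.sh",  # hypothetical script path
    data_paths=["."],
    ask=lambda prompt: "n",
    run=lambda cmd, **kwargs: calls.append(cmd),
)
print(submitted, calls)
```

Injecting `ask` and `run` keeps the default behavior interactive while letting the gate be exercised without a SLURM cluster.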
Scope
$ARGUMENTS
Repository
michaelrizvi/cl…e-config