experiment-design-planner

Experiment Design Planner

Purpose

Help the user plan experiments that can actually answer a research question. This skill is based on the handbook's experiment design principles: start simple, begin with baselines, change one variable at a time, state hypotheses before running, and document negative results.

The output is an experiment plan that can be run, logged, and later explained in a paper or advisor meeting.

When to Use

  • User wants to run a new experiment or ablation
  • User has unclear or noisy experimental results
  • User is preparing baselines and metrics
  • User is changing several model or data choices at once
  • User needs a reproducible experiment plan before using cluster time

Workflow

Stage 1: State the Research Question

Ask:

  • What claim should this experiment support or refute?
  • What is the smallest result that would be meaningful?
  • What existing baseline should it beat, match, or clarify?

If the question is vague, rewrite it into a testable form.

Stage 2: Write Hypotheses Before Running

Capture:

  • Primary hypothesis
  • Alternative explanations
  • Expected direction of change (improve, degrade, or no effect)
  • Expected metric movement (which metric, and roughly how much)
  • Result that would falsify the hypothesis

Do not let the user run first and rationalize later.
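
One way to make a hypothesis hard to rationalize after the fact is to record it as data before any run and check results against it mechanically. The sketch below is only an illustration: the `Hypothesis` class, its field names, and the example numbers are hypothetical and not part of this skill, and it compares means only, without a significance test.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Hypothesis:
    """A hypothesis written down before running (hypothetical structure)."""
    statement: str
    metric: str
    direction: int     # +1 if the metric should rise, -1 if it should fall
    min_effect: float  # smallest change that would count as meaningful

    def verdict(self, baseline: list[float], treatment: list[float]) -> str:
        # Signed change in the mean metric, oriented so positive = as predicted.
        delta = self.direction * (mean(treatment) - mean(baseline))
        if delta >= self.min_effect:
            return "supported"
        if delta <= 0:
            return "falsified"
        return "inconclusive"


h = Hypothesis(
    statement="Label smoothing improves validation accuracy",
    metric="val_accuracy",
    direction=+1,
    min_effect=0.5,
)
# Illustrative per-seed numbers, not real results.
result = h.verdict(baseline=[71.0, 71.4, 70.8], treatment=[72.1, 72.3, 71.9])
print(result)
```

Because `min_effect` and `direction` are fixed before the runs, a small or wrong-direction change cannot quietly be relabeled as a success afterwards.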

Stage 3: Define the Experimental Unit

Specify:

  • Dataset and split
  • Preprocessing
  • Model or method
  • Baselines
  • Metrics
  • Random seeds
  • Compute budget
  • Number of repeats
  • Hardware/environment

If the user lacks a baseline, start there.
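
The setup fields above can be captured in one immutable record so that gaps are caught before any compute is spent. This is a minimal sketch under stated assumptions: `ExperimentSetup`, its field names, and the two-seed minimum are hypothetical choices for illustration, and the placeholder values are not real datasets or models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hypothetical container for the Stage 3 fields."""
    dataset: str
    split: str
    preprocessing: str
    method: str
    baselines: tuple[str, ...]
    metrics: tuple[str, ...]
    seeds: tuple[int, ...]  # number of repeats = number of seeds
    compute_budget: str
    environment: str

    def problems(self) -> list[str]:
        """Return human-readable gaps in the setup (empty list if none)."""
        issues = []
        if not self.baselines:
            issues.append("no baseline: plan the baseline run first")
        if not self.metrics:
            issues.append("no metric defined")
        if len(set(self.seeds)) < 2:  # assumed minimum, adjust as needed
            issues.append("fewer than 2 distinct seeds: variance cannot be estimated")
        return issues


setup = ExperimentSetup(
    dataset="example-dataset v1",
    split="train/val/test, fixed",
    preprocessing="lowercase + tokenize",
    method="proposed-model",
    baselines=(),
    metrics=("val_accuracy",),
    seeds=(0,),
    compute_budget="8 GPU-hours",
    environment="1x GPU, Python 3.11",
)
issues = setup.problems()
for issue in issues:
    print("-", issue)
```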

Stage 4: One-Variable Discipline

List variables:

  • Independent variable: what changes
  • Controlled variables: what must stay fixed
  • Nuisance variables: what could confound results

If the plan changes multiple variables, split it into an ordered ablation table.
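
Splitting a multi-variable plan into single-variable runs can be done mechanically. The sketch below assumes configs are flat dictionaries; the function name `ordered_ablation` and the example hyperparameters are hypothetical. Each generated run differs from the baseline in exactly one key.

```python
def ordered_ablation(baseline: dict, changes: dict) -> list[dict]:
    """Return the baseline run plus one run per changed variable."""
    runs = [{"run": 0, "change": "baseline", "config": dict(baseline)}]
    for i, (name, value) in enumerate(changes.items(), start=1):
        if name not in baseline:
            raise KeyError(f"{name!r} has no baseline value to compare against")
        # Copy the baseline and override only this one variable.
        config = {**baseline, name: value}
        runs.append({
            "run": i,
            "change": f"{name}: {baseline[name]} -> {value}",
            "config": config,
        })
    return runs


baseline = {"lr": 1e-3, "batch_size": 32, "augmentation": "none"}
changes = {"lr": 3e-4, "augmentation": "random_crop"}
runs = ordered_ablation(baseline, changes)

# Emit rows in the shape of the run table used in the plan template.
for r in runs:
    print(f"| {r['run']} | {r['change']} | | planned | |")
```

This version varies each factor against the same baseline. A cumulative ordering, where each run keeps the previous change, is an alternative when the changes are expected to interact; that would need a separate run order and is not shown here.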

Stage 5: Logging and Negative Results

Define the required log fields:

  • Config path and code commit hash
  • Dataset version
  • Seed
  • Hyperparameters
  • Metrics
  • Runtime
  • Failure notes
  • Plot/table output path

Make negative results first-class. A failed run should still record what was tried and what was learned.
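
The log fields above map naturally onto one JSON line per run, with failed runs written through the same path as successful ones. This is a sketch, not a prescribed format: the `log_run` function, the key names, and every value in the usage example are hypothetical placeholders.

```python
import json
import tempfile
import time
from pathlib import Path


def log_run(log_path, *, commit, config_path, dataset_version, seed,
            hyperparameters, metrics=None, runtime_s=None,
            failure_notes="", output_path=""):
    """Append one run record as a JSON line; failed runs are logged too."""
    record = {
        "logged_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "commit": commit,
        "config_path": config_path,
        "dataset_version": dataset_version,
        "seed": seed,
        "hyperparameters": hyperparameters,
        "metrics": metrics or {},
        "runtime_s": runtime_s,
        "failure_notes": failure_notes,
        "output_path": output_path,
        "status": "failed" if failure_notes else "completed",
    }
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


# Usage with placeholder values, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "runs.jsonl"
    log_run(log_file, commit="abc1234", config_path="configs/baseline.yaml",
            dataset_version="v1", seed=0, hyperparameters={"lr": 1e-3},
            metrics={"val_accuracy": 0.71}, runtime_s=512.0,
            output_path="plots/baseline.png")
    log_run(log_file, commit="abc1234", config_path="configs/lr_high.yaml",
            dataset_version="v1", seed=0, hyperparameters={"lr": 1e-1},
            runtime_s=40.0,
            failure_notes="Loss diverged after 200 steps; lr too high for this batch size.")
    records = [json.loads(line)
               for line in log_file.read_text(encoding="utf-8").splitlines()]
```

Keeping the failure note in the same record as the seed and hyperparameters means a negative result can later be cited with the exact conditions under which it occurred.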

Stage 6: Produce the Artifact

Save the completed plan to ~/phd-log/experiments/YYYY-MM-DD-[short-name].md, using this template:

# Experiment Plan — [Short Name]

## Research question
[Question]

## Hypotheses
- Primary:
- Alternatives:
- Falsification condition:

## Setup
- Dataset:
- Split:
- Preprocessing:
- Baseline:
- Method:
- Metrics:
- Seeds / repeats:
- Compute:
- Environment:

## Variables
| Type | Variable | Value(s) | Notes |
|---|---|---|---|
| Independent |  |  |  |
| Controlled |  |  |  |
| Nuisance |  |  |  |

## Run table
| Run | Change | Expected result | Status | Notes |
|---|---|---|---|---|

## Logging checklist
- [ ] Config saved
- [ ] Code commit recorded
- [ ] Dataset version recorded
- [ ] Seed recorded
- [ ] Hyperparameters recorded
- [ ] Metrics saved
- [ ] Runtime recorded
- [ ] Failure notes saved
- [ ] Plot/table path saved

## Decision rule
If [condition], then [next step]. If not, [fallback].
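
The save location named in Stage 6 can be built programmatically so that filenames stay consistent. In this sketch the helper `plan_path` is hypothetical, and lowercasing the short name into a hyphenated slug is an assumption; the skill itself only specifies the YYYY-MM-DD-[short-name].md pattern.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def plan_path(short_name: str, day: Optional[date] = None) -> Path:
    """Build ~/phd-log/experiments/YYYY-MM-DD-[short-name].md for a plan."""
    day = day or date.today()
    # Assumed convention: lowercase, runs of other characters become one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", short_name.lower()).strip("-")
    if not slug:
        raise ValueError("short_name must contain at least one letter or digit")
    return Path.home() / "phd-log" / "experiments" / f"{day.isoformat()}-{slug}.md"


# Example with an arbitrary date and name.
p = plan_path("LR Warmup Ablation", date(2026, 1, 15))
print(p.name)
```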

Tone

Be concrete and conservative. The best experiment plan is usually smaller than the user's first instinct.

What Not to Do

  • Do not accept experiments without a hypothesis.
  • Do not let the user report results without a baseline comparison.
  • Do not bury changed variables in prose.
  • Do not treat negative results as wasted time.

More from a-green-hand-jack/phd-skills

First Seen
Apr 25, 2026