Experiment Design Planner

Purpose

Help the user plan experiments that can actually answer a research question. This skill is based on the handbook's experiment design principles: start simple, begin with baselines, change one variable at a time, state hypotheses before running, and document negative results.

The output is an experiment plan that can be run, logged, and later explained in a paper or advisor meeting.

When to Use

User wants to run a new experiment or ablation
User has unclear or noisy experimental results
User is preparing baselines and metrics
User is changing several model or data choices at once
User needs a reproducible experiment plan before using cluster time

Workflow

Stage 1: State the Research Question

Ask:

What claim should this experiment support or refute?
What is the smallest result that would be meaningful?
What existing baseline should it beat, match, or clarify?

If the question is vague, rewrite it into a testable form.

Stage 2: Write Hypotheses Before Running

Capture:

Primary hypothesis
Alternative explanations
Expected direction of change
Expected metric movement
Failure mode that would falsify the hypothesis

Do not let the user run first and rationalize later.
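One way to enforce this is to write the hypothesis down as a structured record and refuse to launch a run while any field is empty. The sketch below is illustrative only: `HypothesisRecord` and `missing_fields` are hypothetical names, the fields mirror the capture list above, and the example values are placeholders.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime, timezone


@dataclass(frozen=True)
class HypothesisRecord:
    """Hypothetical pre-run record; one field per item in the capture list."""
    primary: str
    alternatives: tuple[str, ...]
    expected_direction: str          # e.g. "increase" or "decrease"
    expected_metric_movement: str    # e.g. "+1 to +2 points on the dev metric"
    falsifying_failure_mode: str
    # Timestamp shows the hypothesis was written before results existed.
    written_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def missing_fields(record: HypothesisRecord) -> list[str]:
    """Return the names of fields left empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Placeholder example: two fields were skipped, so the run should not start.
draft = HypothesisRecord(
    primary="Placeholder: the new method improves the dev metric",
    alternatives=(),
    expected_direction="increase",
    expected_metric_movement="",
    falsifying_failure_mode="Placeholder: no gain over baseline across seeds",
)
gaps = missing_fields(draft)
print(gaps)  # ['alternatives', 'expected_metric_movement']
```

A non-empty result from `missing_fields` is a signal to go back to the user before any compute is spent.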

Stage 3: Define the Experimental Unit

Specify:

Dataset and split
Preprocessing
Model or method
Baselines
Metrics
Random seeds
Compute budget
Number of repeats
Hardware/environment

If the user lacks a baseline, start there.
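The specification above can be captured as a single immutable object so that every run refers to the same definition. This is a minimal sketch, not a prescribed schema: `ExperimentUnit`, its field names, and the `fingerprint` helper are assumptions made for illustration, and the example values are placeholders.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentUnit:
    """Hypothetical container with one field per item in the Stage 3 list."""
    dataset: str
    split: str
    preprocessing: str
    method: str
    baselines: tuple[str, ...]
    metrics: tuple[str, ...]
    seeds: tuple[int, ...]
    compute_budget: str
    repeats: int
    environment: str

    def fingerprint(self) -> str:
        """Short stable hash of the full specification, for run names and logs."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]

    def problems(self) -> list[str]:
        """List gaps that should be fixed before running anything."""
        issues = []
        if not self.baselines:
            issues.append("no baseline defined: start there")
        if not self.metrics:
            issues.append("no metric defined")
        return issues


# Placeholder values; note the empty baselines tuple.
unit = ExperimentUnit(
    dataset="placeholder-dataset-v1",
    split="train/dev",
    preprocessing="none",
    method="placeholder-method",
    baselines=(),
    metrics=("accuracy",),
    seeds=(0, 1, 2),
    compute_budget="placeholder budget",
    repeats=3,
    environment="placeholder environment",
)
print(unit.problems())  # ['no baseline defined: start there']
```

Because the dataclass is frozen, changing any field means creating a new unit with a new fingerprint, which keeps silently edited setups from sharing a name.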

Stage 4: One-Variable Discipline

List variables:

Independent variable: what changes
Controlled variables: what must stay fixed
Nuisance variables: what could confound results

If the plan changes multiple variables, split it into an ordered ablation table.
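As a sketch of that split, the function below expands a baseline configuration and a list of proposed changes into ordered runs, each differing from the baseline in exactly one variable. The names `ablation_runs` and `changed_variables` and the example hyperparameters are hypothetical; a cumulative ablation (each run building on the previous one) would be a different design choice.

```python
def ablation_runs(baseline: dict, changes: list[tuple[str, object]]) -> list[dict]:
    """Build an ordered run list: the baseline first, then one change per run."""
    runs = [{"run": 0, "change": "baseline", "config": dict(baseline)}]
    for index, (name, value) in enumerate(changes, start=1):
        if name not in baseline:
            raise KeyError(f"{name!r} is not a baseline variable")
        runs.append({
            "run": index,
            "change": f"{name}: {baseline[name]!r} -> {value!r}",
            # Everything except `name` stays at its baseline value.
            "config": {**baseline, name: value},
        })
    return runs


def changed_variables(baseline: dict, config: dict) -> list[str]:
    """Names of variables whose value differs from the baseline."""
    return sorted(key for key in baseline if config.get(key) != baseline[key])


# Placeholder configuration and changes.
baseline = {"learning_rate": 1e-3, "batch_size": 32, "augmentation": False}
runs = ablation_runs(baseline, [("learning_rate", 3e-4), ("augmentation", True)])

for row in runs:
    print(row["run"], "|", row["change"])
```

`changed_variables` can double as a guard: any planned configuration for which it returns more than one name is changing several things at once.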

Stage 5: Logging and Negative Results

Define the required log fields:

Config path or commit hash
Dataset version
Seed
Hyperparameters
Metrics
Runtime
Failure notes
Plot/table output path

Make negative results first-class. Even a failed run should record what was tried and what was learned.
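The log fields above could be enforced with a small append-only logger, sketched here as one JSON object per line. The field keys, the `append_run` helper, and the example record are all assumptions for illustration; the point is that a failed run is written with the same fields as a successful one.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical keys, one per required log field listed above.
REQUIRED_FIELDS = (
    "config", "dataset_version", "seed", "hyperparameters",
    "metrics", "runtime_seconds", "failure_notes", "output_path",
)


def append_run(log_path: Path, record: dict) -> None:
    """Append one run to a JSON Lines file, rejecting incomplete records."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing log fields: {missing}")
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


# Placeholder record for a run that failed: metrics are empty, notes are not.
failed_run = {
    "config": "placeholder-commit-hash",
    "dataset_version": "placeholder-v1",
    "seed": 0,
    "hyperparameters": {"learning_rate": 1e-3},
    "metrics": {},
    "runtime_seconds": 120.0,
    "failure_notes": "Placeholder: what was tried and what was learned",
    "output_path": None,
}

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "runs.jsonl"
    append_run(log_path, failed_run)
    lines = log_path.read_text(encoding="utf-8").splitlines()
    logged = [json.loads(line) for line in lines]
```

Checking for the presence of a key rather than a truthy value lets a failed run log empty metrics while still being required to carry a `failure_notes` entry.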

Stage 6: Produce the Artifact

Save the plan to ~/phd-log/experiments/YYYY-MM-DD-[short-name].md, using the following template:

# Experiment Plan — [Short Name]

## Research question
[Question]

## Hypotheses
- Primary:
- Alternatives:
- Falsification condition:

## Setup
- Dataset:
- Split:
- Baseline:
- Method:
- Metrics:
- Seeds / repeats:
- Compute:
- Environment:

## Variables
| Type | Variable | Value(s) | Notes |
|---|---|---|---|
| Independent |  |  |  |
| Controlled |  |  |  |
| Nuisance |  |  |  |

## Run table
| Run | Change | Expected result | Status | Notes |
|---|---|---|---|---|

## Logging checklist
- [ ] Config saved
- [ ] Code commit recorded
- [ ] Dataset version recorded
- [ ] Seed recorded
- [ ] Metrics saved
- [ ] Failure notes saved
- [ ] Plot/table path saved

## Decision rule
If [condition], then [next step]. If not, [fallback].
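A decision rule of this shape can be fixed in code before any results exist, which makes it harder to move the goalposts afterwards. The sketch below is one possible form, assuming a higher-is-better metric, a comparison of means across seeds, and a minimum gain chosen in Stage 1 as the smallest meaningful result; `decide` and all the numbers are hypothetical placeholders.

```python
from statistics import mean


def decide(
    baseline_scores: list[float],
    method_scores: list[float],
    min_gain: float,
    next_step: str,
    fallback: str,
) -> str:
    """Return the next step if the mean gain meets the threshold, else the fallback.

    Assumes a higher-is-better metric and one score per seed.
    """
    gain = mean(method_scores) - mean(baseline_scores)
    return next_step if gain >= min_gain else fallback


# Placeholder scores, one per seed.
choice = decide(
    baseline_scores=[0.70, 0.72, 0.71],
    method_scores=[0.75, 0.76, 0.74],
    min_gain=0.02,
    next_step="scale up to the full dataset",
    fallback="return to the baseline and revisit the hypothesis",
)
print(choice)  # scale up to the full dataset
```

A comparison of means ignores variance across seeds; with few repeats, reporting the per-seed scores alongside the decision keeps the evidence visible.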

Tone

Be concrete and conservative. The best experiment plan is usually smaller than the user's first instinct.

What Not to Do

Do not accept experiments without a hypothesis.
Do not let the user report results without a baseline to compare against.
Do not bury changed variables in prose.
Do not treat negative results as wasted time.

