experiment-design-planner
Experiment Design Planner
Purpose
Help the user plan experiments that can actually answer a research question. This skill is based on the handbook's experiment design principles: start simple, begin with baselines, change one variable at a time, state hypotheses before running, and document negative results.
The output is an experiment plan that can be run, logged, and later explained in a paper or advisor meeting.
When to Use
- User wants to run a new experiment or ablation
- User has unclear or noisy experimental results
- User is preparing baselines and metrics
- User is changing several model or data choices at once
- User needs a reproducible experiment plan before using cluster time
Workflow
Stage 1: State the Research Question
Ask:
- What claim should this experiment support or refute?
- What is the smallest result that would be meaningful?
- What existing baseline should it beat, match, or clarify?
If the question is vague, rewrite it into a testable form.
Stage 2: Write Hypotheses Before Running
Capture:
- Primary hypothesis
- Alternative explanations
- Expected direction of change
- Expected metric movement
- Failure mode that would falsify the hypothesis
Do not let the user run first and rationalize later.
Stage 3: Define the Experimental Unit
Specify:
- Dataset and split
- Preprocessing
- Model or method
- Baselines
- Metrics
- Random seeds
- Compute budget
- Number of repeats
- Hardware/environment
If the user lacks a baseline, start there.
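One way to make "Random seeds" and "Number of repeats" concrete is to summarize each configuration across its seeds before comparing it with the baseline. The sketch below is illustrative only: the function name `summarize_repeats` and the score lists are hypothetical placeholders, not results from any real run.

```python
from statistics import mean, stdev


def summarize_repeats(scores: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) over repeated runs."""
    if len(scores) < 2:
        # A single run gives no estimate of seed-to-seed variation.
        raise ValueError("need at least two repeats to estimate variation")
    return mean(scores), stdev(scores)


# Placeholder metric values, one per seed (hypothetical, for illustration).
baseline_scores = [1.0, 2.0, 3.0]
method_scores = [2.0, 3.0, 4.0]

baseline_mean, baseline_std = summarize_repeats(baseline_scores)
method_mean, method_std = summarize_repeats(method_scores)
print(f"baseline: {baseline_mean:.2f} +/- {baseline_std:.2f}")
print(f"method:   {method_mean:.2f} +/- {method_std:.2f}")
```

Reporting the spread next to the mean makes it easier to see whether a difference from the baseline is larger than the variation between seeds.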
Stage 4: One-Variable Discipline
List variables:
- Independent variable: what changes
- Controlled variables: what must stay fixed
- Nuisance variables: what could confound results
If the plan changes multiple variables, split it into an ordered ablation table.
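The split into an ordered ablation table can be sketched as follows, assuming each run changes exactly one variable relative to a fixed baseline. The helper names (`ablation_runs`, `to_markdown`) and the example variables are hypothetical; the column headers mirror the run table in the plan template.

```python
from typing import Any


def ablation_runs(baseline: dict[str, Any], changes: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the baseline run plus one run per changed variable."""
    unknown = set(changes) - set(baseline)
    if unknown:
        raise KeyError(f"variables missing from baseline: {sorted(unknown)}")
    runs = [{"run": 0, "change": "baseline", "config": dict(baseline)}]
    for i, (name, value) in enumerate(changes.items(), start=1):
        config = dict(baseline)  # start from the baseline every time
        config[name] = value     # change exactly one variable
        runs.append({
            "run": i,
            "change": f"{name}: {baseline[name]} -> {value}",
            "config": config,
        })
    return runs


def to_markdown(runs: list[dict[str, Any]]) -> str:
    """Render the runs as rows of the plan's run table."""
    lines = [
        "| Run | Change | Expected result | Status | Notes |",
        "|---|---|---|---|---|",
    ]
    for r in runs:
        lines.append(f"| {r['run']} | {r['change']} | | planned | |")
    return "\n".join(lines)


# Hypothetical baseline and proposed changes, for illustration only.
baseline = {"lr": 0.001, "batch_size": 32, "augmentation": False}
changes = {"lr": 0.0003, "augmentation": True}
runs = ablation_runs(baseline, changes)
print(to_markdown(runs))
```

Rebuilding every run from the baseline, rather than stacking changes, keeps each row attributable to a single variable. The "Expected result" column is left blank on purpose so the hypothesis is written by hand before the run.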
Stage 5: Logging and Negative Results
Define the required log fields:
- Config path or commit hash
- Dataset version
- Seed
- Hyperparameters
- Metrics
- Runtime
- Failure notes
- Plot/table output path
Make negative results first-class. A failed run should still answer what was tried and what was learned.
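A minimal sketch of a log record covering the fields above, assuming one JSON object per run. The field names, the `make_log_record` helper, and all example values are hypothetical; the point is that a failed run is logged with the same fields as a successful one.

```python
import json

# One key per required log field listed above (names are assumptions).
REQUIRED_FIELDS = (
    "config_ref",        # config path or commit hash
    "dataset_version",
    "seed",
    "hyperparameters",
    "metrics",
    "runtime_s",
    "failure_notes",
    "output_path",       # plot/table output path
)


def make_log_record(**fields) -> str:
    """Serialize one run as JSON, refusing records with missing fields."""
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing log fields: {missing}")
    return json.dumps(fields, sort_keys=True)


# Hypothetical failed run: no metrics, but what was tried and learned is kept.
record = make_log_record(
    config_ref="configs/example.yaml",
    dataset_version="v1",
    seed=0,
    hyperparameters={"lr": 0.01},
    metrics={},
    runtime_s=120.0,
    failure_notes="Loss diverged early; learning rate is likely too high.",
    output_path="",
)
print(record)
```

Requiring `failure_notes` on every record, even when it is an empty string, keeps negative results in the same log as positive ones.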
Stage 6: Produce the Artifact
Save the plan to ~/phd-log/experiments/YYYY-MM-DD-[short-name].md, using the following template:
# Experiment Plan — [Short Name]
## Research question
[Question]
## Hypotheses
- Primary:
- Alternatives:
- Falsification condition:
## Setup
- Dataset:
- Split:
- Baseline:
- Method:
- Metrics:
- Seeds / repeats:
- Compute:
- Environment:
## Variables
| Type | Variable | Value(s) | Notes |
|---|---|---|---|
| Independent | | | |
| Controlled | | | |
| Nuisance | | | |
## Run table
| Run | Change | Expected result | Status | Notes |
|---|---|---|---|---|
## Logging checklist
- [ ] Config saved
- [ ] Code commit recorded
- [ ] Dataset version recorded
- [ ] Seed recorded
- [ ] Metrics saved
- [ ] Failure notes saved
- [ ] Plot/table path saved
## Decision rule
If [condition], then [next step]. If not, [fallback].
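The save path from Stage 6 can be built programmatically. This is a sketch under assumptions: the helper name `plan_path` and the slug rule (lowercase, with runs of non-alphanumeric characters replaced by hyphens) are not specified by the skill.

```python
import re
from datetime import date
from pathlib import Path


def plan_path(short_name: str, day: date) -> Path:
    """Build ~/phd-log/experiments/YYYY-MM-DD-[short-name].md."""
    # Assumed slug rule: lowercase, non-alphanumeric runs become hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", short_name.lower()).strip("-")
    if not slug:
        raise ValueError("short name must contain letters or digits")
    return Path.home() / "phd-log" / "experiments" / f"{day.isoformat()}-{slug}.md"


# Hypothetical experiment name and date.
path = plan_path("LR Warmup!", date(2024, 1, 15))
print(path.name)  # 2024-01-15-lr-warmup.md
```

Creating the parent directory (for example with `path.parent.mkdir(parents=True, exist_ok=True)`) is left to the caller, so the helper has no side effects.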
Tone
Be concrete and conservative. The best experiment plan is usually smaller than the user's first instinct.
What Not to Do
- Do not accept experiments without a hypothesis.
- Do not let the user report results without a baseline comparison.
- Do not bury changed variables in prose.
- Do not treat negative results as wasted time.