experiment-design
Experiment Design
Design rigorous machine learning experiments that produce credible, reproducible results.
Design Checklist
Work through each section before writing any code.
1. Hypothesis
State the hypothesis as a falsifiable claim:
"We claim that [method X] achieves [metric Y] on [dataset Z] because [mechanism]."
If the hypothesis is vague, help the user sharpen it before proceeding.
2. Independent and Dependent Variables
- Independent variable: What is being changed (e.g., architecture, loss function, data augmentation)?
- Dependent variable: What is being measured (e.g., accuracy, FID score, wall-clock time)?
- Controlled variables: List everything held constant.
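One way to check that the controlled variables really are held constant is to compare run configurations before launching. The sketch below is illustrative only: it assumes each run is described by a flat dictionary, and the key names (`loss`, `lr`, `seed`) are hypothetical.

```python
# Sentinel so that a missing key is distinguishable from a key set to None.
_MISSING = object()


def changed_keys(baseline_config, treatment_config):
    """Return the sorted list of keys whose values differ between two configs."""
    all_keys = baseline_config.keys() | treatment_config.keys()
    return sorted(
        key
        for key in all_keys
        if baseline_config.get(key, _MISSING) != treatment_config.get(key, _MISSING)
    )


# Hypothetical configs: only the loss function (the independent variable) differs.
baseline = {"loss": "cross_entropy", "lr": 0.1, "seed": 0}
treatment = {"loss": "focal", "lr": 0.1, "seed": 0}

differences = changed_keys(baseline, treatment)
```

If `differences` contains anything other than the intended independent variable, a controlled variable has drifted between the two runs.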
3. Baselines
Select baselines at three levels:
- Naive: A trivially simple method (majority class, mean predictor)
- Standard: The most widely used existing approach
- Strong: The current state-of-the-art on the chosen benchmark
Justify each choice. Avoid strawman baselines.
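The naive level is cheap to implement, so there is little reason to skip it. The following is a minimal sketch of the two naive baselines named above (majority class and mean predictor) using only the standard library. The function names and the toy labels are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean


def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return lambda inputs: [most_common_label for _ in inputs]


def mean_predictor_baseline(train_targets):
    """Return a predictor that always outputs the mean training target."""
    target_mean = mean(train_targets)
    return lambda inputs: [target_mean for _ in inputs]


def accuracy(predictions, labels):
    """Fraction of predictions that exactly match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy labels, purely illustrative (not taken from any real dataset).
train_labels = ["cat", "cat", "dog", "cat"]
test_labels = ["cat", "dog", "cat", "cat"]

predict = majority_class_baseline(train_labels)
naive_accuracy = accuracy(predict(test_labels), test_labels)
```

A proposed method that cannot clearly beat `naive_accuracy` on the same test split is a warning sign worth investigating before running the more expensive baselines.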
4. Datasets and Splits
- Name the dataset, version, and source.
- Specify train/val/test splits. Use standard splits if they exist.
- Flag any data leakage risks.
- Note dataset limitations (bias, domain coverage, size).
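When a dataset ships without standard splits, a deterministic assignment plus an overlap check covers two of the points above. This is a sketch under stated assumptions: every example has a stable string ID, and hashing that ID is an acceptable way to split (it is one option among several, not a recommendation from the checklist itself). The function names and the default fractions are hypothetical.

```python
import hashlib


def assign_split(example_id, val_fraction=0.1, test_fraction=0.1):
    """Deterministically map an example ID to 'train', 'val', or 'test'."""
    digest = hashlib.sha256(example_id.encode("utf-8")).hexdigest()
    bucket = (int(digest, 16) % 10_000) / 10_000  # value in [0, 1)
    if bucket < test_fraction:
        return "test"
    if bucket < test_fraction + val_fraction:
        return "val"
    return "train"


def find_leakage(splits):
    """Return the set of IDs that appear in more than one split."""
    first_seen_in = {}
    leaked = set()
    for split_name, example_ids in splits.items():
        for example_id in example_ids:
            previous = first_seen_in.setdefault(example_id, split_name)
            if previous != split_name:
                leaked.add(example_id)
    return leaked


# Hypothetical IDs: "b" was accidentally placed in both train and test.
leaked_ids = find_leakage({"train": ["a", "b"], "val": ["c"], "test": ["b", "d"]})
```

An exact-ID check only catches the simplest form of leakage. Near-duplicates, or examples derived from the same source (same patient, same document, same video), need a grouping key instead of a raw ID.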
5. Metrics
- Choose metrics that align with the task objective.
- Prefer metrics with established semantics over novel ones.
- Report multiple metrics when they capture different aspects.
- Report variability: give mean ± standard deviation over N seeds, state N, and back any claimed improvement with a significance test.
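Aggregating over seeds can be sketched with the standard library alone. The scores below are placeholders, not results, and the helper names are hypothetical. The sample standard deviation (`statistics.stdev`, which divides by N - 1) is used here as one reasonable choice for a small number of seeds.

```python
from statistics import mean, stdev


def summarize_over_seeds(scores_by_seed):
    """Compute mean and sample standard deviation of a metric across seeds."""
    values = list(scores_by_seed.values())
    if len(values) < 2:
        raise ValueError("Need at least two seeds to estimate a standard deviation.")
    return {"n_seeds": len(values), "mean": mean(values), "std": stdev(values)}


def format_summary(summary, digits=2):
    """Render a summary as 'mean ± std (N=...)' for a results table."""
    return (
        f"{summary['mean']:.{digits}f} ± {summary['std']:.{digits}f} "
        f"(N={summary['n_seeds']})"
    )


# Placeholder accuracies keyed by random seed (illustrative values only).
scores = {0: 0.80, 1: 0.82, 2: 0.84}
summary = summarize_over_seeds(scores)
report_line = format_summary(summary)
```

Stating N next to each figure matters: the same standard deviation is far less informative at 3 seeds than at 10.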
6. Compute Budget
State the hardware, estimated runtime, and number of seeds. This enables reproducibility and contextualizes cost.
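A rough budget follows from simple multiplication: configurations × seeds × hours per run × GPUs per run. The helper below is a back-of-the-envelope sketch. Its name and the example numbers are assumptions, and real runs will vary with queueing, restarts, and evaluation overhead.

```python
def estimate_gpu_hours(num_configs, num_seeds, hours_per_run, gpus_per_run=1):
    """Estimate total GPU-hours for a grid of configurations and seeds."""
    return num_configs * num_seeds * hours_per_run * gpus_per_run


# Hypothetical plan: 4 configurations (method, baselines, ablations),
# 5 seeds each, about 2 hours per run on a single GPU.
total_gpu_hours = estimate_gpu_hours(num_configs=4, num_seeds=5, hours_per_run=2.0)
```

Here the estimate is 4 × 5 × 2.0 × 1 = 40 GPU-hours. Doing this arithmetic before launching makes it clear whether the planned number of seeds is affordable.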
7. Ablations
Design ablations that isolate each component's contribution. Each ablation should remove or replace exactly one thing.
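The "exactly one thing" rule can be enforced mechanically by generating ablation configurations from the full configuration. This sketch assumes components are represented as boolean flags. The component names (`augmentation`, `aux_loss`, `pretraining`) are hypothetical.

```python
def single_component_ablations(full_config):
    """Build one variant per enabled component, disabling only that component."""
    ablations = {}
    for component, enabled in full_config.items():
        if not enabled:
            continue  # Nothing to remove if the component is already off.
        variant = dict(full_config)  # Copy so the full config is not mutated.
        variant[component] = False
        ablations[f"no_{component}"] = variant
    return ablations


# Hypothetical full method with three optional components, all enabled.
full_method = {"augmentation": True, "aux_loss": True, "pretraining": True}
ablation_configs = single_component_ablations(full_method)
```

Each entry in `ablation_configs` differs from `full_method` in a single key, so any change in the metric can be attributed to that component. Replacement-style ablations (swapping a component for a simpler alternative) would need values richer than booleans.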
8. Failure Modes
Identify at least two ways the experiment could give misleading results, and how to detect or mitigate them.
Output Format
Produce a structured experiment plan as a markdown document with all sections above filled in. Highlight any section where the user needs to make a decision before proceeding.