mkt-experiment
Experiment Design
Solution Track — Step 4 of 4. Designs minimum viable tests with clear success/iterate/kill rules.
Inputs Required
- An initiative with hypothesis and target metric (from
.agents/mkt/prioritized-initiatives.md+.agents/mkt/targets.md) - OR: User describes what they want to test
Output
.agents/mkt/experiment-[name].md
Quality Gate
Before delivering, verify:
- Executable in ≤2 weeks
- Costs <10% of full initiative budget (or $0 for internal-only changes)
- Success threshold is a specific number (not "significant improvement" or "we'll know it when we see it")
- Sample size is sufficient (checked against table below) — or test type adjusted if insufficient
Chain Position
Previous: mkt-funnel-planner | Next: Scale (if success) / mkt-initiative-generate (if kill)
Before Starting
Step 0: Product Context
Check for .agents/mkt/product-context.md. If missing: INTERVIEW. Ask the user 8 product questions (what, who, problem, differentiator, proof points, pricing, objections, voice) and save to .agents/mkt/product-context.md. Or recommend running mkt-copywriting to bootstrap it.
Required Artifacts
| Artifact | Source | If Missing |
|---|---|---|
prioritized-initiatives.md |
mkt-initiative-prioritize | INTERVIEW. Ask what to test. |
targets.md |
mkt-funnel-planner | INTERVIEW. Ask for baseline metrics. |
Optional Artifacts
| Artifact | Source | Benefit |
|---|---|---|
product-context.md |
mkt-copywriting | Test design context |
Read .agents/mkt/prioritized-initiatives.md and .agents/mkt/targets.md if they exist. If not, interview the user:
- "What are you trying to test? What's your hypothesis?"
- "What metric are you measuring? What's the current baseline?"
- "What would success look like? What would make you stop?"
You need at minimum: a hypothesis, a target metric, and a baseline number.
Step 1: Choose Test Type
| Type | When | Rigor |
|---|---|---|
| A/B Test | Can split traffic randomly; enough volume | High — controls for time |
| Before-After | Can't split (infrastructure change, whole-site change) | Medium — time effects confound |
| Cohort | Testing on a specific segment first | Medium — segment differences |
| Pilot | New capability; need qualitative signal first | Low — directional only |
If unsure: check sample size (Step 3). Not enough volume for A/B in 2 weeks → use Before-After or Pilot.
Step 2: Define Decision Rules
Be specific. Vague thresholds → endless "let's give it one more week."
| Outcome | Threshold | Action |
|---|---|---|
| Success | [metric] ≥ [number] | Scale: roll out fully |
| Iterate | [metric] between [X] and [Y] | Change ONE variable: [specify which]. Rerun for [duration]. Max 2 iterations. |
| Kill | [metric] < [number] after [full duration] | Archive learnings. Return to mkt-initiative-generate. |
Step 3: Sample Size Check
For A/B tests — minimum visitors PER VARIANT:
| Baseline Rate | 20% Relative Lift | 50% Lift |
|---|---|---|
| 1% | ~40,000 | ~6,400 |
| 5% | ~8,000 | ~1,300 |
| 10% | ~4,000 | ~640 |
| 25% | ~1,600 | ~260 |
Quick math: [Your daily traffic] × [test days] ÷ 2 = visitors per variant. Compare against table.
If insufficient for A/B: use Before-After, or make a bigger change (aim for 50%+ lift instead of 20%).
See references/sample-size-guide.md for full tables.
Artifact Template
On re-run: rename existing artifact to experiment-[name].v[N].md and create new with incremented version.
---
skill: mkt-experiment
version: 1
date: {{today}}
status: draft
---
# Experiment: [Name]
**Initiative:** [from prioritized list]
## Design
| Field | Value |
|-------|-------|
| Hypothesis | If [action], then [metric] changes from [baseline] to [target], because [reason] |
| Test Type | A/B / Before-After / Cohort / Pilot |
| Duration | [≤2 weeks] |
| Budget | [$X or $0] |
| Primary Metric | [metric name] |
| Baseline | [current value] |
## Decision Rules
| Outcome | Threshold | Action |
|---------|-----------|--------|
| Success | [metric] ≥ [X] | [Scale how] |
| Iterate | [metric] between [X-Y] | Change: [one specific variable]. Rerun: [duration]. |
| Kill | [metric] < [X] after [duration] | Archive. Next: [what to try instead] |
## Sample Size Check
- Daily traffic: [N]
- Test duration: [days]
- Visitors per variant: [N × days ÷ 2]
- Required for [X%] lift at baseline [Y%]: [from table]
- **Sufficient?** Yes / No → [if no, adjusted approach]
## Next Step
[When to check results. What to do with each outcome.]
Worked Example 1: A/B Test
# Experiment: Paid Targeting Lookalike Test
**Date:** 2026-03-13
**Skill:** mkt-experiment
**Initiative:** Restore + Refine Paid Targeting
## Design
| Field | Value |
|-------|-------|
| Hypothesis | If we use a 1% converters lookalike audience, then paid signup rate increases from 1.2% to ≥3%, because the lookalike matches our ICP better than current broad targeting |
| Test Type | A/B — split paid budget 50/50 between current and lookalike |
| Duration | 10 days |
| Budget | $500 (redirecting existing $50/day spend) |
| Primary Metric | Paid visitor-to-signup rate |
| Baseline | 1.2% |
## Decision Rules
| Outcome | Threshold | Action |
|---------|-----------|--------|
| Success | ≥2.5% signup rate | Roll out lookalike to 100% of paid budget |
| Iterate | 1.5-2.5% | Narrow lookalike from 1% to 0.5%, add interest-based exclusions. Rerun 10 days. |
| Kill | <1.5% after 10 days | Targeting isn't the primary issue. Test homepage messaging next. |
## Sample Size Check
- Daily paid traffic: ~300 visitors
- 10 days: 3,000 total → 1,500 per variant
- At 1.2% baseline, need ~1,300 per variant for 50% lift detection
- **Sufficient?** Yes — 1,500 > 1,300. Can detect a 50%+ relative lift (1.2% → 1.8%+).
Worked Example 2: Before-After
# Experiment: Homepage Social Proof Restoration
**Date:** 2026-03-13
**Skill:** mkt-experiment
**Initiative:** Restore Homepage Social Proof
## Design
| Field | Value |
|-------|-------|
| Hypothesis | If we restore customer logos + testimonial to homepage hero, then bounce rate drops from 52% to ≤42%, because the redesign removed trust signals that reduced visitor confidence |
| Test Type | Before-After — change affects all visitors |
| Duration | 7 days after deployment |
| Budget | $0 (copy/design work only) |
| Primary Metric | Homepage bounce rate |
| Baseline | 52% |
## Decision Rules
| Outcome | Threshold | Action |
|---------|-----------|--------|
| Success | Bounce ≤ 42% | Keep changes. Explore further trust signal additions. |
| Iterate | 42-48% bounce | Add more proof: case study snippet, user count, or G2 rating badge. Rerun 7 days. |
| Kill | >48% after 7 days | Social proof alone isn't enough. Run full Homepage A/B (old vs new) from parked list. |
## Sample Size Check
- Daily homepage visitors: ~800
- 7 days: 5,600 visitors
- Before-After with 52% baseline: 5,600 is ample to detect a 10-point bounce rate change
- **Sufficient?** Yes
Loop-Back Protocol
When an experiment concludes, follow the appropriate path:
Kill Decision
- Append learnings to the experiment file under
## Post-Mortem - Check: Is the root cause (
.agents/mkt/root-cause.md) still valid?- Yes → Route to S1: Run
mkt-initiative-generatewith constraint: "[Failed approach] did not work because [reason]. Generate alternatives that avoid this failure mode." - No → Route to P1: Problem may have shifted. Run
mkt-diagnosisto redefine.
- Yes → Route to S1: Run
- Update
.agents/mkt/prioritized-initiatives.md— move killed initiative to "Kill" status with reason.
Iterate Decision
- Change exactly ONE variable from the original test
- Maximum 2 iterations allowed before triggering Kill protocol
- Document iteration rationale in experiment file under
## Iterations
Scale Decision
- Document final results in experiment file under
## Results - Move to full execution — no further testing needed for this initiative
- Run
mkt-attributionto map the scaled initiative to KPIs
After the Experiment
| Result | Do This |
|---|---|
| Success | Scale. Set full targets via mkt-funnel-planner. Celebrate briefly, then test the next initiative. |
| Iterate | Change ONE variable. Document what and why. Max 2 iterations before escalating to kill/pivot. |
| Kill | Archive in the experiment file: what you tested, what happened, what you learned. Return to mkt-initiative-generate with the new learning. |
References
- references/experiment-templates.md — Templates for landing page, email, ad creative, feature flag tests
- references/sample-size-guide.md — Full sample size tables and common mistakes