mkt-experiment
Experiment Design
Solution Track — Step 4 of 4. Designs minimum viable tests with clear success/iterate/kill rules.
Inputs Required
- An initiative with hypothesis and target metric (from
.agents/mkt/prioritized-initiatives.md+.agents/mkt/targets.md) - OR: User describes what they want to test
Output
.agents/mkt/experiment-[name].md
Quality Gate
Before delivering, verify:
- Executable in ≤2 weeks
- Costs <10% of full initiative budget (or $0 for internal-only changes)
- Success threshold is a specific number (not "significant improvement" or "we'll know it when we see it")
- Sample size is sufficient (checked against table below) — or test type adjusted if insufficient
Chain Position
Previous: mkt-funnel-planner | Next: Scale (if success) / mkt-initiative-generate (if kill)
Before Starting
Step 0: Product Context
Check for .agents/mkt/product-context.md. If missing: INTERVIEW. Ask the user 8 product questions (what, who, problem, differentiator, proof points, pricing, objections, voice) and save to .agents/mkt/product-context.md. Or recommend running mkt-copywriting to bootstrap it.
Required Artifacts
| Artifact | Source | If Missing |
|---|---|---|
prioritized-initiatives.md |
mkt-initiative-prioritize | INTERVIEW. Ask what to test. |
targets.md |
mkt-funnel-planner | INTERVIEW. Ask for baseline metrics. |
Optional Artifacts
| Artifact | Source | Benefit |
|---|---|---|
product-context.md |
mkt-copywriting | Test design context |
Read .agents/mkt/prioritized-initiatives.md and .agents/mkt/targets.md if they exist. If not, interview the user:
- "What are you trying to test? What's your hypothesis?"
- "What metric are you measuring? What's the current baseline?"
- "What would success look like? What would make you stop?"
You need at minimum: a hypothesis, a target metric, and a baseline number.
Step 1: Choose Test Type
| Type | When | Rigor |
|---|---|---|
| A/B Test | Can split traffic randomly; enough volume | High — controls for time |
| Before-After | Can't split (infrastructure change, whole-site change) | Medium — time effects confound |
| Cohort | Testing on a specific segment first | Medium — segment differences |
| Pilot | New capability; need qualitative signal first | Low — directional only |
If unsure: check sample size (Step 3). Not enough volume for A/B in 2 weeks → use Before-After or Pilot.
Step 2: Define Decision Rules
Be specific. Vague thresholds → endless "let's give it one more week."
| Outcome | Threshold | Action |
|---|---|---|
| Success | [metric] ≥ [number] | Scale: roll out fully |
| Iterate | [metric] between [X] and [Y] | Change ONE variable: [specify which]. Rerun for [duration]. Max 2 iterations. |
| Kill | [metric] < [number] after [full duration] | Archive learnings. Return to mkt-initiative-generate. |
Step 3: Sample Size Check
For A/B tests — minimum visitors PER VARIANT:
| Baseline Rate | 20% Relative Lift | 50% Lift |
|---|---|---|
| 1% | ~40,000 | ~6,400 |
| 5% | ~8,000 | ~1,300 |
| 10% | ~4,000 | ~640 |
| 25% | ~1,600 | ~260 |
Quick math: [Your daily traffic] × [test days] ÷ 2 = visitors per variant. Compare against table.
If insufficient for A/B: use Before-After, or make a bigger change (aim for 50%+ lift instead of 20%).
See references/sample-size-guide.md for full tables.
Artifact Template
On re-run: rename existing artifact to experiment-[name].v[N].md and create new with incremented version.
---
skill: mkt-experiment
version: 1
date: {{today}}
status: draft
---
# Experiment: [Name]
**Initiative:** [from prioritized list]
## Design
| Field | Value |
|-------|-------|
| Hypothesis | If [action], then [metric] changes from [baseline] to [target], because [reason] |
| Test Type | A/B / Before-After / Cohort / Pilot |
| Duration | [≤2 weeks] |
| Budget | [$X or $0] |
| Primary Metric | [metric name] |
| Baseline | [current value] |
## Decision Rules
| Outcome | Threshold | Action |
|---------|-----------|--------|
| Success | [metric] ≥ [X] | [Scale how] |
| Iterate | [metric] between [X-Y] | Change: [one specific variable]. Rerun: [duration]. |
| Kill | [metric] < [X] after [duration] | Archive. Next: [what to try instead] |
## Sample Size Check
- Daily traffic: [N]
- Test duration: [days]
- Visitors per variant: [N × days ÷ 2]
- Required for [X%] lift at baseline [Y%]: [from table]
- **Sufficient?** Yes / No → [if no, adjusted approach]
## Next Step
[When to check results. What to do with each outcome.]
Worked Example 1: A/B Test
# Experiment: Paid Targeting Lookalike Test
**Date:** 2026-03-13
**Skill:** mkt-experiment
**Initiative:** Restore + Refine Paid Targeting
## Design
| Field | Value |
|-------|-------|
| Hypothesis | If we use a 1% converters lookalike audience, then paid signup rate increases from 1.2% to ≥3%, because the lookalike matches our ICP better than current broad targeting |
| Test Type | A/B — split paid budget 50/50 between current and lookalike |
| Duration | 10 days |
| Budget | $500 (redirecting existing $50/day spend) |
| Primary Metric | Paid visitor-to-signup rate |
| Baseline | 1.2% |
## Decision Rules
| Outcome | Threshold | Action |
|---------|-----------|--------|
| Success | ≥2.5% signup rate | Roll out lookalike to 100% of paid budget |
| Iterate | 1.5-2.5% | Narrow lookalike from 1% to 0.5%, add interest-based exclusions. Rerun 10 days. |
| Kill | <1.5% after 10 days | Targeting isn't the primary issue. Test homepage messaging next. |
## Sample Size Check
- Daily paid traffic: ~300 visitors
- 10 days: 3,000 total → 1,500 per variant
- At 1.2% baseline, need ~1,300 per variant for 50% lift detection
- **Sufficient?** Yes — 1,500 > 1,300. Can detect a 50%+ relative lift (1.2% → 1.8%+).
Worked Example 2: Before-After
# Experiment: Homepage Social Proof Restoration
**Date:** 2026-03-13
**Skill:** mkt-experiment
**Initiative:** Restore Homepage Social Proof
## Design
| Field | Value |
|-------|-------|
| Hypothesis | If we restore customer logos + testimonial to homepage hero, then bounce rate drops from 52% to ≤42%, because the redesign removed trust signals that reduced visitor confidence |
| Test Type | Before-After — change affects all visitors |
| Duration | 7 days after deployment |
| Budget | $0 (copy/design work only) |
| Primary Metric | Homepage bounce rate |
| Baseline | 52% |
## Decision Rules
| Outcome | Threshold | Action |
|---------|-----------|--------|
| Success | Bounce ≤ 42% | Keep changes. Explore further trust signal additions. |
| Iterate | 42-48% bounce | Add more proof: case study snippet, user count, or G2 rating badge. Rerun 7 days. |
| Kill | >48% after 7 days | Social proof alone isn't enough. Run full Homepage A/B (old vs new) from parked list. |
## Sample Size Check
- Daily homepage visitors: ~800
- 7 days: 5,600 visitors
- Before-After with 52% baseline: 5,600 is ample to detect a 10-point bounce rate change
- **Sufficient?** Yes
Loop-Back Protocol
When an experiment concludes, follow the appropriate path:
Kill Decision
- Append learnings to the experiment file under
## Post-Mortem - Check: Is the root cause (
.agents/mkt/root-cause.md) still valid?- Yes → Route to S1: Run
mkt-initiative-generatewith constraint: "[Failed approach] did not work because [reason]. Generate alternatives that avoid this failure mode." - No → Route to P1: Problem may have shifted. Run
mkt-diagnosisto redefine.
- Yes → Route to S1: Run
- Update
.agents/mkt/prioritized-initiatives.md— move killed initiative to "Kill" status with reason.
Iterate Decision
- Change exactly ONE variable from the original test
- Maximum 2 iterations allowed before triggering Kill protocol
- Document iteration rationale in experiment file under
## Iterations
Scale Decision
- Document final results in experiment file under
## Results - Move to full execution — no further testing needed for this initiative
- Run
mkt-attributionto map the scaled initiative to KPIs
After the Experiment
| Result | Do This |
|---|---|
| Success | Scale. Set full targets via mkt-funnel-planner. Celebrate briefly, then test the next initiative. |
| Iterate | Change ONE variable. Document what and why. Max 2 iterations before escalating to kill/pivot. |
| Kill | Archive in the experiment file: what you tested, what happened, what you learned. Return to mkt-initiative-generate with the new learning. |
References
- references/experiment-templates.md — Templates for landing page, email, ad creative, feature flag tests
- references/sample-size-guide.md — Full sample size tables and common mistakes
More from hungv47/agent-skills
mkt-copywriting
This skill should be used when the user asks to "write copy", "create a headline", "write a tagline", "improve my copy", "write landing page copy", "create ad copy", "write email subject lines", "write product description", or mentions copywriting, persuasive writing, marketing copy, or conversion copy. Uses Harry Dry's Marketing Examples framework.
19tool-plan-interviewer
This skill should be used when the agent enters plan mode, when the user mentions "plan file", "spec file", "interview me about the plan", "help me think through this", or when starting implementation planning. Conducts in-depth, multi-round interviews using AskUserQuestion to surface non-obvious requirements, edge cases, and tradeoffs before writing specifications.
14design-user-flow
This skill should be used when the user asks to "create a user flow", "map navigation", "design user journey", "create wireflow", "plan screen transitions", "map the app flow", "design navigation", or mentions user flow, wireflow, navigation map, user journey, screen transitions, or pre-interface design research. Creates user flow diagrams and wireflows for digital products.
14eng-system-architecture
This skill should be used when the user asks to "design system architecture", "plan the tech stack", "create database schema", "design API structure", "architect the system", "plan the backend", "design infrastructure", or mentions system architecture, technical architecture, database design, API design, or tech stack selection. Transforms product documentation into comprehensive technical blueprints.
13mkt-diagnosis
This skill should be used when the user asks to "diagnose the problem", "break down the issue", "build a logic tree", "what's causing this", or mentions problem diagnosis, logic trees, MECE decomposition, or structured problem solving. This is the ENTRY POINT for problem analysis — use when no diagnosis exists yet.
13mkt-lp-optimization
This skill should be used when the user asks to "optimize a landing page", "improve conversion rates", "review my landing page", "create a high-converting landing page", "audit my LP", "reduce bounce rate", or mentions landing page design, conversion optimization, A/B testing strategy, or CTA optimization.
13