ln-814-optimization-executor
Paths: File paths (
shared/,references/,../ln-*) are relative to skills repo root. If not found at CWD, locate this SKILL.md directory and go up one level for repo root.
ln-814-optimization-executor
Type: L3 Worker Category: 8XX Optimization
Executes optimization hypotheses from the researcher using keep/discard autoresearch loop. Supports multi-file changes, compound baselines, and any optimization type (algorithm, architecture, query, caching, batching).
Overview
| Aspect | Details |
|---|---|
| Input | .optimization/{slug}/context.md OR conversation context (standalone invocation) |
| Output | Optimized code on isolated branch, per-hypothesis results, experiment log |
| Pattern | Strike-first: apply all → test → measure. Bisect only on failure. A/B only for contested alternatives |
Workflow
Phases: Pre-flight → Baseline → Strike-First Execution → Report → Gap Analysis
Phase 0: Pre-flight Checks
Slug Resolution
- If invoked via Agent with contextStore containing
slug— use directly. - If invoked standalone — derive slug from context_file path or ask user.
Step 1: Load Context
Read .optimization/{slug}/context.md from project root. Contains problem statement, profiling results, research hypotheses, and target metric.
If file not found: check conversation context for the same data (standalone invocation).
Step 2: Pre-flight Validation
| Check | Required | Action if Missing |
|---|---|---|
| Hypotheses provided (H1..H7) | Yes | Block — nothing to execute |
| Test infrastructure | Yes | Block (see ci_tool_detection.md) |
| Git clean state | Yes | Block (need clean baseline for revert) |
| Worktree isolation | Yes | Create per git_worktree_fallback.md |
| E2E safety test | No (recommended) | Read from context; WARN if null — full test suite as fallback gate |
MANDATORY READ: Load shared/references/git_worktree_fallback.md — use optimization rows.
MANDATORY READ: Load shared/references/ci_tool_detection.md — use Test Frameworks + Benchmarks sections.
E2E Safety Test
Read e2e_test_command from context file (discovered by profiler during test discovery phase).
| Source | Action |
|---|---|
Context has e2e_test_command |
Use as functional safety gate in Phase 2 |
Context has e2e_test_command = null |
WARN: full test suite serves as fallback gate |
| Standalone (no context) | User must provide test command; block if missing |
Phase 1: Establish Baseline
Reuse baseline from performance map (already measured with real metrics).
From Context File
Read performance_map.baseline and performance_map.test_command from .optimization/{slug}/context.md.
| Field | Source |
|---|---|
test_command |
Discovered/created test command |
baseline |
Multi-metric snapshot: wall time, CPU, memory, I/O |
Verification Run
Run test_command once to confirm baseline is still valid (code unchanged since profiling):
| Step | Action |
|---|---|
| 1 | Run test_command |
| 2 | IF result within 10% of baseline.wall_time_ms → baseline confirmed |
| 3 | IF result diverges > 10% → re-measure (3 runs, median) as new baseline |
| 4 | IF test FAILS → BLOCK: "test fails on unmodified code" |
Phase 2: Strike-First Execution
MANDATORY READ: Load optimization_categories.md for pattern reference during implementation.
Apply maximum changes at once. Only fall back to A/B testing where sources genuinely disagree on approach.
Step 1: Triage Hypotheses
Split hypotheses from researcher into two groups:
| Group | Criteria | Action |
|---|---|---|
| Uncontested | Clear best approach, no conflicting alternatives | Apply directly in the strike |
| Contested | Multiple approaches exist (e.g., source A says cache, source B says batch) OR conflicts_with another hypothesis |
A/B test each alternative on top of full implementation |
Most hypotheses should be uncontested — the researcher already ranked them by evidence.
Step 2: Strike (Apply All Uncontested)
1. APPLY all uncontested hypotheses at once (all file edits)
2. VERIFY: Run full test suite
IF tests FAIL:
- IF fixable (typo, missing import) → fix & re-run ONCE
- IF fundamental → BISECT (see Step 4)
3. E2E GATE (if e2e_test_command not null):
IF FAIL → BISECT
4. MEASURE: 5 runs, median
5. COMPARE: improvement vs baseline
IF improvement meets target → DONE. Commit all:
git add {all_files}
git commit -m "perf: apply optimizations H1,H2,H3,... (+{improvement}%)"
IF no improvement → BISECT
Step 3: Contested Alternatives (A/B on top of strike)
For each contested pair/group, with ALL uncontested changes already applied:
FOR each contested hypothesis group:
1. Apply alternative A → test → measure (5 runs, median)
2. Revert alternative A, apply alternative B → test → measure
3. KEEP the winner. Commit.
4. Winner becomes part of the baseline for next contested group.
Step 4: Bisect (only on strike failure)
If strike fails tests or shows no improvement:
1. Revert all changes: git checkout -- . && git clean -fd
2. Binary search: apply first half of hypotheses → test
- IF passes → problem in second half
- IF fails → problem in first half
3. Narrow down to the breaking hypothesis
4. Remove it from strike, re-apply remaining → test → measure
5. Log removed hypothesis with reason
Scope Rules
| Rule | Description |
|---|---|
| File scope | Multiple files allowed (not limited to single function) |
| Signature changes | Allowed if tests still pass |
| New files | Allowed (cache wrapper, batch adapter, utility) |
| New dependencies | Allowed if already in project ecosystem (e.g., using configured Redis) |
| Time budget | 45 minutes total |
Revert Protocol
| Scope | Command |
|---|---|
| Full revert | git checkout -- . && git clean -fd (safe in worktree) |
| Single hypothesis | git checkout -- {files} (only during bisect) |
Safety Rules
| Rule | Description |
|---|---|
| Traceability | Commit message lists all applied hypothesis IDs |
| Isolation | All work in isolated worktree; never modify main worktree |
| Bisect only on failure | Do NOT test hypotheses individually unless strike fails or alternatives genuinely conflict |
| Crash triage | Runtime crash → fix once if trivial (typo, import), else bisect to find cause |
Phase 3: Report Results
Report Schema
| Field | Description |
|---|---|
| baseline | Original measurement (metric + value) |
| final | Final measurement after optimizations |
| total_improvement_pct | Overall percentage improvement |
| target_met | Boolean — did we reach the target metric? |
| strike_result | clean (all applied) / bisected (some removed) / failed |
| hypotheses_applied | List of hypothesis IDs applied in strike |
| hypotheses_removed | List removed during bisect (with reasons) |
| contested_results | Per-contested group: alternatives tested, winner, measurement |
| branch | Worktree branch name |
| files_modified | All changed files |
| e2e_test | { command, source, baseline_passed, final_passed } or null |
Results Comparison (mandatory)
Show baseline vs final for EVERY metric from performance_map.baseline. Include both percentage and multiplier.
| Metric | Baseline | After Strike | Improvement |
|--------|----------|-------------|-------------|
| Wall time | 7280ms | 3800ms | 47.8% (1.9x) |
| CPU time | 850ms | 720ms | 15.3% (1.2x) |
| Memory peak | 256MB | 245MB | 4.3% |
| HTTP round-trips | 13 | 2 | 84.6% (6.5x) |
Target: 5000ms → Achieved: 3800ms ✓ TARGET MET
Per-Function Delta (if instrumentation available)
If instrumented_files from context is non-empty, run test_command once more AFTER strike to capture per-function timing with the same instrumentation the profiler placed:
| Function | Before (ms) | After (ms) | Delta |
|----------|------------|------------|-------|
| mt_translate | 3500 | 450 | -87% (7.8x) |
| tikal_extract | 2800 | 2800 | 0% (unchanged) |
Then clean up: git checkout -- {instrumented_files} — remove all profiling instrumentation before final commit.
Present both tables to user. This is the primary deliverable — numbers the user sees first.
Experiment Log
Write to {project_root}/.optimization/{slug}/ln-814-log.tsv:
| Column | Description |
|---|---|
| timestamp | ISO 8601 |
| phase | strike / bisect / contested |
| hypotheses | Comma-separated IDs applied in this round |
| baseline_ms | Baseline before this round |
| result_ms | Measurement after changes |
| improvement_pct | Percentage change |
| status | applied / removed / alternative_a / alternative_b |
| commit | Git commit hash |
| files | Comma-separated modified files |
| e2e_status | pass / fail / skipped |
Append to existing file if present (enables tracking across multiple runs).
Phase 4: Gap Analysis (If Target Not Met)
If target metric not reached after all hypotheses:
| Section | Content |
|---|---|
| Achievement | What was achieved (original → final, improvement %) |
| Remaining bottlenecks | From time map: which steps still dominate |
| Remaining cycles | If coordinator runs multi-cycle: "{remaining} optimization cycles available for remaining bottlenecks" |
| Infrastructure recommendations | If bottleneck requires infra changes (scaling, caching layer, CDN) |
| Further research | Optimization directions not explored in this run |
Error Handling
| Error | Recovery |
|---|---|
| Strike fails all tests | Bisect to find breaking hypothesis, remove it, retry |
| Strike shows no improvement | Bisect to identify ineffective hypotheses |
| Measurement inconsistent (high variance) | Increase runs to 10, use median |
| Worktree creation fails | Fall back to branch per git_worktree_fallback.md |
| Time budget exceeded | Stop loop, report partial results with hypotheses remaining |
| Multi-file revert fails | git checkout -- . in worktree (safe — worktree is isolated) |
References
- optimization_categories.md — optimization pattern checklist
shared/references/ci_tool_detection.md(test + benchmark detection)shared/references/git_worktree_fallback.md(worktree isolation)
Definition of Done
- Baseline established using same metric type as observed problem
- Hypotheses triaged: uncontested vs contested
- Strike applied: all uncontested hypotheses implemented at once
- Tests pass after strike
- Contested alternatives A/B tested on top of full implementation
- Bisect performed only if strike fails (not preemptively)
- E2E safety test passes (or documented as unavailable)
- Experiment log written to
.optimization/{slug}/ln-814-log.tsv - Report returned with baseline, final, improvement%, strike result
- All changes on isolated branch, pushed to remote
- Gap analysis provided if target metric not met
Version: 2.0.0 Last Updated: 2026-03-14