OpenClaw Test Performance
Use evidence first. The goal is real pnpm test speed/RSS improvement with
coverage intact, not runner tuning by guesswork.
Workflow
- Read the relevant local `AGENTS.md` files before editing:
  - `src/agents/AGENTS.md` for agent/import hotspots.
  - `src/channels/AGENTS.md` and `src/plugins/AGENTS.md` for plugin/channel laziness.
  - `src/gateway/AGENTS.md` for server lifecycle tests.
  - `test/helpers/AGENTS.md` and `test/helpers/channels/AGENTS.md` for shared contract helpers.
  - `src/infra/outbound/AGENTS.md` for outbound/media/action tests.
- Establish a baseline before changing code:
  - Prefer `pnpm test:perf:groups --full-suite --allow-failures --output <file>` for full-suite ranking.
  - For a scoped hotspot use: `/usr/bin/time -l pnpm test <file-or-files> --maxWorkers=1 --reporter=verbose`
  - For import-heavy suspicion add: `OPENCLAW_VITEST_IMPORT_DURATIONS=1 OPENCLAW_VITEST_PRINT_IMPORT_BREAKDOWN=1`.
- Separate wall/runner noise from real file cost:
  - Compare Vitest duration, test body timing, import breakdown, wall time, and max RSS.
  - Re-run single files when grouped/full-suite numbers look stale or noisy.
  - If a full-suite grouped run reports a lane failure but JSON says tests passed, capture that as harness/noise and verify the suspect file directly.
- Pick the next attack by return and risk:
  - High return: one file/test dominates seconds or RSS and has a clear root.
  - Lower risk: static descriptors, target parsing, routing, auth bypass, setup hints, registry fixtures, or test server lifecycle.
  - Higher risk: real memory/runtime behavior, live providers, protocol contracts, or broad production refactors.
- Fix the root cause, not the symptom:
  - Move static metadata/parsing into narrow helpers or lightweight artifacts reused by full runtime and fast paths.
  - Prefer dependency injection, loaded-plugin-only lookup, explicit fixtures, and pure helpers over broad mocks.
  - Reuse suite-level servers/clients when a fresh handshake is irrelevant.
  - Keep schedulers/background loops off unless the test proves scheduling.
- Preserve coverage shape:
  - Do not delete a slow integration proof unless the exact production composition is extracted into a named helper and tested.
  - Keep one cheap integration smoke when cross-component wiring matters.
  - State explicitly what incidental coverage was removed, if any.
- Re-benchmark the same command after the change and compute seconds plus percent gain.
- Update the running report when requested or when this thread is tracking one. Include before/after commands, artifacts, coverage notes, verification, and next attack order.
- Commit with `scripts/committer "<message>" <paths...>` and push when the user asked for commits/pushes. Stage only files touched for this attack.
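
The "dependency injection, explicit fixtures, and pure helpers over broad mocks" step can be sketched as below. This is a minimal illustration, not code from the repo: `ChannelDescriptor`, `DescriptorLookup`, `parseTarget`, and the `chat` fixture are all hypothetical names. The point is that the parser receives its lookup as an argument, so a test can pass a tiny fixture instead of loading or mocking the full plugin runtime.

```typescript
// Hypothetical types and names, for illustration only.
interface ChannelDescriptor {
  id: string;
  targetPrefix: string;
}

// The lookup is injected, so the parser never imports a plugin runtime itself.
type DescriptorLookup = (channelId: string) => ChannelDescriptor | undefined;

interface ParsedTarget {
  channelId: string;
  address: string;
}

// Pure helper: parses "<channel>:<address>" using only the injected lookup.
function parseTarget(raw: string, lookup: DescriptorLookup): ParsedTarget | null {
  const sep = raw.indexOf(":");
  if (sep <= 0) return null;
  const channelId = raw.slice(0, sep);
  const descriptor = lookup(channelId);
  if (!descriptor) return null;
  const address = raw.slice(sep + 1);
  return address.startsWith(descriptor.targetPrefix) ? { channelId, address } : null;
}

// Test-side fixture: a static map stands in for the loaded-plugin registry.
const fixtures = new Map<string, ChannelDescriptor>([
  ["chat", { id: "chat", targetPrefix: "@" }],
]);
const lookupFixture: DescriptorLookup = (id) => fixtures.get(id);
```

Production code would pass a lookup backed by already-loaded plugins; the test passes `lookupFixture` and pays no import cost beyond this file.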
Common Root Causes
- Full bundled channel/plugin runtime loaded for static data.
- `getChannelPlugin()` fallback used when an already-loaded fixture or pure parser would suffice.
- Broad `api.ts`, `runtime-api.ts`, `test-api.ts`, or plugin-sdk barrels pulled into hot tests.
- Partial-real mocks using `importActual()` around broad modules.
- `vi.resetModules()` plus fresh imports in per-test loops.
- Test plugin registry seeded in `beforeAll` while runtime state resets in `afterEach`.
- Per-test gateway/server/client startup when state reset would suffice.
- Runtime/default model/auth selection paid by idle snapshots or fixtures.
- Plugin-owned media/action discovery triggered before checking whether args contain plugin-owned fields.
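
The last root cause above has a cheap fix pattern: check the arguments before paying for discovery. A minimal sketch follows, assuming hypothetical field names (`media`, `pluginAction`) and a stand-in `discoverPluginActions()` that represents the expensive plugin load; none of these names come from the codebase.

```typescript
type ActionArgs = Record<string, unknown>;

// Hypothetical list of fields that only plugins handle.
const PLUGIN_OWNED_FIELDS = ["media", "pluginAction"];

// Stand-in for expensive discovery; counts calls so the cost is observable.
let discoveryCalls = 0;
let cachedActions: string[] | undefined;

function discoverPluginActions(): string[] {
  if (cachedActions === undefined) {
    discoveryCalls += 1; // the real version would load plugin runtime here
    cachedActions = ["upload", "react"];
  }
  return cachedActions;
}

// Cheap guard: a key check with no imports and no I/O.
function hasPluginOwnedFields(args: ActionArgs): boolean {
  return PLUGIN_OWNED_FIELDS.some((field) => field in args);
}

function resolveActions(args: ActionArgs): string[] {
  // Skip discovery entirely when no plugin-owned field is present.
  if (!hasPluginOwnedFields(args)) return [];
  return discoverPluginActions();
}
```

Tests that send plain arguments then never trigger discovery, and tests that do need it pay the cost once.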
Benchmark Commands
Scoped file:
timeout 240 /usr/bin/time -l pnpm test <file> --maxWorkers=1 --reporter=verbose
Scoped file with import breakdown:
timeout 240 /usr/bin/time -l env \
OPENCLAW_VITEST_IMPORT_DURATIONS=1 \
OPENCLAW_VITEST_PRINT_IMPORT_BREAKDOWN=1 \
pnpm test <file> --maxWorkers=1 --reporter=verbose
Grouped suite:
pnpm test:perf:groups --full-suite --allow-failures \
--output .artifacts/test-perf/<name>.json
Reuse an existing Vitest JSON report:
pnpm test:perf:groups --report <vitest-json> \
--output .artifacts/test-perf/<name>.json
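
To turn the scoped-file commands above into before/after numbers, the `/usr/bin/time -l` summary can be parsed rather than read by eye. The sketch below assumes the BSD/macOS output format, where the summary line ends in `real ... user ... sys` and `maximum resident set size` is reported in bytes; GNU `time` uses `-v` and a different layout, so adjust accordingly. `parseTimeOutput` and `TimeStats` are hypothetical names.

```typescript
interface TimeStats {
  realSeconds: number;
  maxRssMb: number;
}

// Parses the stderr summary of `/usr/bin/time -l` (BSD/macOS format assumed).
function parseTimeOutput(stderr: string): TimeStats | null {
  const real = /([\d.]+)\s+real/.exec(stderr);
  const rss = /(\d+)\s+maximum resident set size/.exec(stderr);
  if (!real || !rss) return null;
  return {
    realSeconds: Number(real[1]),
    // Assumption: the value is in bytes, as on macOS.
    maxRssMb: Number(rss[1]) / (1024 * 1024),
  };
}

// Example input shaped like the macOS summary (values are made up).
const sample = [
  "        2.31 real         1.80 user         0.42 sys",
  "           209715200  maximum resident set size",
].join("\n");

const stats = parseTimeOutput(sample);
```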
Verification
- Always run the targeted test surface that proves the change.
- Run `pnpm check` before commit unless the change is docs-only and the hook handles it.
- Run `pnpm build` when touching lazy-loading, bundled artifacts, package boundaries, dynamic imports, build output, or public surfaces.
- If deps are missing/stale, run `pnpm install` and retry the exact failed command once.
- Use the report format:
| Metric | Before | After | Gain |
| -------------- | -----: | ----: | ------------: |
| File wall time | `Xs` | `Ys` | `-Zs` (`P%`) |
| Max RSS | `XMB` | `YMB` | `-ZMB` (`P%`) |
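
The gain column is simple arithmetic: the saved amount is `before - after`, and the percentage is that amount divided by `before`. A small sketch that produces one table row in the format above follows; `formatGainRow` is a hypothetical helper, not an existing script.

```typescript
// Wraps a value in backticks, matching the table cells above.
const tick = (s: string): string => "`" + s + "`";

// Builds one report row. A positive percent means the "after" run is cheaper.
function formatGainRow(metric: string, unit: string, before: number, after: number): string {
  const saved = before - after;
  const percent = before === 0 ? 0 : (saved / before) * 100;
  // An improvement is shown as a negative delta, a regression as a positive one.
  const delta = `${saved >= 0 ? "-" : "+"}${Math.abs(saved).toFixed(1)}${unit}`;
  return [
    `| ${metric}`,
    tick(`${before}${unit}`),
    tick(`${after}${unit}`),
    `${tick(delta)} (${tick(`${percent.toFixed(1)}%`)}) |`,
  ].join(" | ");
}

console.log(formatGainRow("Max RSS", "MB", 10, 8));
```

A regression shows up as a `+` delta with a negative percentage, which keeps accidental slowdowns visible in the report.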
Handoff
Keep the final handoff concise:
- Root cause.
- Files changed.
- Before/after numbers.
- Coverage retained.
- Verification commands.
- Commit hash and push status.