nodejs-performance
Node.js Performance
Use this workflow to turn Node.js performance/resource investigations into safe, reviewable PRs.
Goals
- Improve execution time first: reduce p50/p95/p99 latency and increase throughput without changing intended behavior.
- Reduce CPU, memory, event-loop lag, I/O pressure, or lock contention when it supports execution-time gains.
- Ship small, isolated changes with measurable impact.
Operating Rules
- Work on one optimization per PR.
- Always choose the highest expected-impact task first.
- Confirm and respect intentional behaviors before changing them.
- Prefer low-risk changes in high-frequency paths.
- Prioritize request/job execution-path work over bootstrap/startup micro-optimizations unless startup is on the critical path at scale.
- Include evidence: targeted tests + before/after benchmark.
Impact-First Selection
Before coding, rank candidates using this score:
priority = (frequency x blast_radius x expected_gain) / (risk x effort)
Use 1-5 for each factor:
- frequency: how often the path runs in production.
- blast_radius: how many requests/jobs/users are affected.
- expected_gain: estimated latency/resource improvement.
- risk: probability of behavior regression.
- effort: engineering time and change surface area.
Pick the top-ranked candidate, then validate with a baseline measurement.
If two candidates have similar scores, pick the one with the clearer end-to-end execution-time impact.
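The scoring formula above can be sketched directly. The candidate names and 1-5 factor values below are illustrative, not taken from a real profile:

```javascript
// Rank optimization candidates by the priority score defined above.
// The candidates and their 1-5 factor values are hypothetical examples.
const candidates = [
  { name: 'json-reparse-per-request', frequency: 5, blast_radius: 5, expected_gain: 3, risk: 2, effort: 2 },
  { name: 'startup-config-load', frequency: 1, blast_radius: 2, expected_gain: 4, risk: 1, effort: 3 },
];

function priority(c) {
  return (c.frequency * c.blast_radius * c.expected_gain) / (c.risk * c.effort);
}

const ranked = [...candidates].sort((a, b) => priority(b) - priority(a));
for (const c of ranked) console.log(`${c.name}: ${priority(c).toFixed(2)}`);
// json-reparse-per-request ranks first (18.75 vs 2.67)
```

The hot request-path candidate wins despite a lower expected_gain, because frequency and blast_radius dominate the numerator.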
Prioritization Targets
Start with code that runs on every request/job/task:
- Request/job wrappers and middleware.
- Retry/timeout/circuit-breaker code.
- Connection pools (DB/Redis/HTTP) and socket reuse.
- Stream/pipeline transformations and buffering.
- Serialization/deserialization hot paths (JSON, parsers, schema validation).
- Queue consumers, schedulers, and worker dispatch.
- Event listener attach/detach lifecycle and cleanup logic.
Deprioritize unless justified by production profile:
- One-time startup/bootstrap code.
- Rare admin/debug-only flows.
- Teardown paths that are not on the steady-state critical path.
Common Hot-Path Smells
- Recomputing invariant values per invocation.
- Re-parsing code/AST repeatedly.
- Duplicate async lookups returning the same value.
- Per-call heavy object allocation in common-case parsing.
- Unnecessary awaits in teardown/close/dispose paths.
- Missing fast paths for dominant input shapes.
- Unbounded retries or retry storms under degraded dependencies.
- Excessive concurrency causing memory spikes or downstream saturation.
- Work done for logging/telemetry/metrics formatting even when disabled.
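The "duplicate async lookups" smell above is often fixed by collapsing concurrent calls for the same key into one in-flight promise. `loadUser` below is a hypothetical stand-in for any async dependency call:

```javascript
// Collapse duplicate concurrent lookups for the same key into a single
// in-flight promise. `loadUser` is a hypothetical slow dependency call.
const inFlight = new Map();
let backendCalls = 0;

async function loadUser(id) {
  backendCalls++; // counts real backend hits
  return { id };
}

function dedupedLoadUser(id) {
  if (inFlight.has(id)) return inFlight.get(id);
  const promise = loadUser(id).finally(() => inFlight.delete(id));
  inFlight.set(id, promise);
  return promise;
}

// Three concurrent callers share one backend hit:
Promise.all([dedupedLoadUser(1), dedupedLoadUser(1), dedupedLoadUser(1)])
  .then(() => console.log('backend calls:', backendCalls)); // prints 1
```

The entry is removed in `finally`, so later (non-concurrent) callers still get fresh data; this dedupes in-flight work without acting as a cache.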
Execution Workflow
- Pick one candidate
  - Rank candidates and pick the one with the highest priority score.
  - Explain the issue in one sentence.
  - State the expected impact (CPU, latency, memory, event-loop lag, I/O, contention).
- Prove it is hot
  - Add a focused micro-benchmark or scenario benchmark.
  - Capture baseline numbers before editing.
  - Prefer scenario benchmarks that include real request/job flow when the goal is execution-time improvement.
  - For resource issues, capture process metrics (rss, heap, FD count, event-loop delay).
- Design minimal fix
  - Keep behavior-compatible defaults.
  - Add a fallback path for edge cases.
  - Avoid broad refactors in the same PR.
- Implement
  - Make the smallest patch that removes repeated work.
  - Keep interfaces stable unless change is necessary.
- Test
  - Add/adjust targeted tests for new behavior and regressions.
  - Run the relevant package tests, not the whole monorepo by default.
  - Add concurrency/degradation tests when the bug appears only under load.
- Benchmark again
  - Re-run the same benchmark with the same parameters.
  - Report absolute and relative deltas.
  - Include latency deltas first (p50/p95/p99, throughput), then resource deltas when applicable.
- Package PR
  - Branch naming: codex/perf-<area>-<change>.
  - Commit message: perf(<package>): <what changed>.
  - Include risk notes and how easily the change can be rolled back.
- Iterate
  - Wait for review, then move to the next isolated improvement.
Benchmarking Guidance
- Keep benchmark scope narrow to isolate one change.
- Use warmup iterations.
- Measure both:
  - micro: operation-level overhead.
  - scenario: request/job flow, concurrency, and degraded-dependency conditions.
- For execution-time work, scenario numbers are the decision-maker; micro numbers are supporting evidence.
- Always print:
- total time
- per-op time
- p50/p95/p99 latency when applicable
- speedup ratio
- iterations and workload shape
- resource counters (rss, heap, handles, event-loop delay) when relevant
Resource Exhaustion Checklist
- Cap concurrency at each boundary (ingress, queue, downstream clients).
- Ensure timeout + cancellation are wired end-to-end.
- Ensure retries are bounded and jittered.
- Confirm listeners/timers/intervals are always cleaned up.
- Confirm streams are closed/destroyed on success and error paths.
- Confirm object caches have size/TTL controls.
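The bounded-and-jittered-retries item can be sketched as follows; the attempt cap and base delay are illustrative defaults, not recommendations:

```javascript
// Bounded, jittered retries with a hard attempt cap, so a degraded
// dependency cannot trigger a retry storm. `op` is any async operation.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetries(op, { maxAttempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts) break;
      // Full jitter: random delay in [0, base * 2^attempt) spreads retries out.
      const delay = Math.random() * baseDelayMs * 2 ** attempt;
      await sleep(delay);
    }
  }
  throw lastError; // attempts exhausted
}
```

Pair this with a concurrency cap at each boundary so retries from many concurrent requests cannot multiply load on an already degraded dependency.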
CI / Flake Handling
- If failures appear only in CI, add temporary diagnostic output to the affected tests.
- Serialize only the affected flaky tests when resource contention is the cause.
- Keep determinism improvements in test code, not production code, unless required.
Output Template
For each PR, report:
- Issue being fixed.
- Why it matters under load.
- Code locations changed.
- Tests run and results.
- Benchmark before/after numbers (execution first: p50/p95/p99 and throughput).
- Risk assessment.
- Next candidate optimization.