improve
Improve
Structured improvement pass grounded in what was observed during this session.
Principles (Always Active)
- Every suggestion must cite a specific observation — file path, error seen, pattern noticed, or behavior verified. "I saw X" beats "common practice suggests Y."
- Distinguish grounded suggestions (observed evidence) from speculative ones (general knowledge). Mark speculative suggestions explicitly.
- Prioritize by impact to the user, not by ease of description.
- Do not suggest improvements that would change public API contracts, auth boundaries, or data schemas without flagging as high-risk.
- Simplicity is a feature. Do not suggest adding complexity unless it prevents a concrete problem observed in this session.
When to Use
- User asks "knowing what you know now, what are some improvements?" or similar
- Before claiming done in a ralph loop (self-check)
- Before generating a handoff
- After completing a significant chunk of work
- When reviewing code that was just written or modified
When Not to Use
- At the start of a session before meaningful work has been done
- When the user explicitly asks to skip review and ship
- For architectural redesigns (use spec/plan flow instead)
Dimensions
Analyze across these dimensions. Skip any dimension with nothing grounded to say.
Correctness
Error handling gaps, unhandled edge cases, type safety issues, race conditions, missing validation at system boundaries.
Simplicity
Unnecessary indirection, dead code, naming that obscures intent, over-engineered abstractions, complexity that doesn't earn its keep. Reference the simplify skill for implementation.
Security
Auth boundary gaps, input validation holes, secret handling issues, OWASP concerns observed in touched code.
Test Coverage
Untested code paths observed during the session, missing test layers (unit/integration/e2e), fixture realism gaps, assertions that don't verify meaningful behavior.
Performance
Bottlenecks observed during the session (slow queries, N+1 patterns, unnecessary recomputation). Only flag what was actually observed, not hypothetical.
Developer Experience
API ergonomics issues, missing documentation for non-obvious behavior, CLI friction, configuration that should be extracted.
Workflow
-
Gather context: Review what happened this session — files changed, commands run, errors encountered, tests written, patterns established.
-
Analyze each dimension: For each, check whether session observations surface anything concrete. Skip dimensions with nothing to say.
-
Structure each suggestion:
[IMP-N] effort/gate: summary
Observation: what was seen (file:line, error message, test gap, etc.)
Suggestion: concrete change
Grounded: yes/no
Where:
effort:trivial(< 5 min) |small(< 30 min) |medium(hours)gate: which delivery gate this strengthens:TDD|DEV|E2E|REVIEW|CI
-
Sort by impact (highest first), not by dimension.
-
Present as actionable list. If in a ralph loop, address trivial/small improvements inline before claiming done. Log medium improvements as decisions for the user to prioritize.
Integration with Ralph Loops
When running inside a ralph loop (.rl/state.json exists):
- Run the improvement pass before
.rl/rl done - Address
trivialimprovements inline — just fix them - Address
smallimprovements if they're within the current milestone scope - Log
mediumimprovements:.rl/rl log decision "IMP-N deferred: [reason]" - Do not expand scope beyond the task's milestones for medium improvements
Integration with Handoffs
When generating a handoff, include unaddressed improvements in the <next_steps> section. Use the same [IMP-N] format so the receiving agent can reference them.
Output Format
## Improvements
N suggestions (X grounded, Y speculative)
[IMP-1] trivial/DEV: Extract magic number to config constant
Observation: `src/server.ts:42` uses hardcoded port 3000, but other services read from env
Suggestion: Read from `PORT` env var with 3000 as default
Grounded: yes
[IMP-2] small/TDD: Add boundary test for empty input
Observation: `processItems([])` path untested — saw it handle non-empty arrays only in test suite
Suggestion: Add test case for empty array input in `process-items.test.ts`
Grounded: yes
[IMP-3] medium/E2E: Add timeout handling for external API calls
Observation: Catalog search took 1.2s avg in profiling — no timeout configured
Suggestion: Add configurable timeout with bounded retry
Grounded: yes
Red Flags
- Suggestions without observations — if you can't point to something specific, don't suggest it
- More than 10 suggestions — focus on the top 5-7 by impact
- Suggestions that expand scope beyond what was touched — stay within the blast radius of the current work
- Generic "add more tests" without specifying which paths are untested