Code Review
Code review is a judgment surface, not a generic style pass. Optimize for the highest-signal findings: correctness bugs, behavioral regressions, trust-boundary mistakes, performance risks, and missing verification.
When to use this skill
- Review a pull request, diff, merge request, or patch stack before merge
- Decide whether a change should be approved, blocked, or sent back for fixes
- Identify the most important bugs, regressions, or missing tests in a change
- Turn vague "please review this" requests into a risk-ranked findings list
- Check whether the stated intent of a change matches what the diff actually does
Prefer a neighboring skill when the main job is not reviewing a concrete code change:
- testing-strategies for deciding validation policy before tests exist
- debugging for reproducing and isolating a known failure
- security-best-practices for broader hardening guidance without a concrete diff
- performance-optimization for measurement-led tuning work rather than review judgment
Instructions
Step 1: Establish review context
Start by pinning down:
- change intent: feature, bug fix, refactor, migration, dependency update, or rollout change
- blast radius: how many files or subsystems changed and whether critical paths moved
- verification evidence: tests, typecheck, lint, screenshots, or manual checks already provided
- risk class: auth, data integrity, billing, API compatibility, performance, or operational safety
If the request has no diff, patch, or code artifact to inspect, do not fake a review. Route to a better-fit planning or analysis skill instead.
Step 2: Look for findings before style comments
Prioritize this order:
- correctness and regression risk
- security and trust-boundary mistakes
- data-loss or compatibility risk
- missing or weak verification
- maintainability issues that are likely to cause future bugs
Do not lead with formatting or naming unless the change is otherwise clean.
Read references/review-priorities.md when the diff is large or the important
findings are hard to rank.
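A contrived Python sketch of the same rule, with invented names: when a diff contains both a correctness bug and a style nit, the bug is the finding and the nit is at most a footnote.
```python
# Hypothetical diff excerpt, invented for illustration.
def recent(orders, n):        # naming nit: "n" would read better as "limit"
    return orders[:n - 1]     # correctness bug: returns n - 1 orders, not n
```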
Step 3: Inspect the diff by failure mode
Check the change against these questions:
- Does the implementation actually satisfy the stated intent?
- Did control flow, error handling, or cleanup behavior change in a risky way?
- Are there hidden behavior changes at boundaries such as auth, persistence, API schemas, caching, concurrency, or background jobs?
- Did the change add new assumptions without validation, fallback handling, or migration support?
- Does the test evidence prove the risky paths, or only the happy path?
Prefer concrete evidence over generic checklists. A small focused finding beats ten broad reminders.
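A minimal sketch of the kind of hidden boundary change these questions are meant to catch; the cache and database names are invented for illustration, not taken from any real diff.
```python
# Before: a cache miss falls through to the database; storage errors
# surface to the caller.
def get_profile(user_id, cache, db):
    profile = cache.get(user_id)
    if profile is None:
        profile = db.load_profile(user_id)
        cache.set(user_id, profile)
    return profile

# After a "tidy refactor": storage errors are now swallowed and callers
# silently receive None. Control flow and error handling at the
# persistence boundary changed even though the stated intent was cleanup.
# That behavior change, with a file/line reference, is the finding.
def get_profile(user_id, cache, db):
    try:
        profile = cache.get(user_id) or db.load_profile(user_id)
        cache.set(user_id, profile)
        return profile
    except Exception:
        return None
```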
Step 4: Treat missing verification as a review issue
Review is not complete just because the code "looks fine."
- If the risky behavior is untested, say so explicitly
- Distinguish missing merge-blocking evidence from nice-to-have follow-up coverage
- Tie the missing test request to a concrete failure mode
- Route policy questions to testing-strategies and implementation work to the relevant testing skill
Read references/findings-format-and-severity.md when deciding whether to block
on missing tests or weaker evidence.
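For example, instead of asking for "more tests", name the failure mode and the test that would prove it. A self-contained sketch under invented names (SessionStore, ExpiredTokenError); a real review would point at the project's own types.
```python
import pytest

class ExpiredTokenError(Exception):
    pass

# Stand-in for the code under review, invented for illustration.
class SessionStore:
    def __init__(self):
        self.active = {"old-session"}

    def refresh(self, token, expired):
        if expired:
            self.active.discard(token)   # expiry must also invalidate
            raise ExpiredTokenError(token)
        return "new-session"

# The requested regression test, tied to a concrete failure mode:
# an expired token must be rejected and its session must not stay usable.
def test_expired_refresh_is_rejected_and_invalidated():
    store = SessionStore()
    with pytest.raises(ExpiredTokenError):
        store.refresh("old-session", expired=True)
    assert "old-session" not in store.active
```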
Step 5: Produce a findings-first review
Expected response order:
- Findings: ordered by severity, each with file/line references when possible
- Open questions / assumptions: only if they materially affect correctness
- Change summary: brief and secondary
Each finding should include:
- severity: critical, high, medium, or low
- what is wrong or risky
- why it matters in behavior, not only in style
- what evidence is missing or what change would address it
If there are no material findings, say so explicitly and still mention any residual risk or verification gaps.
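One lightweight way to keep findings consistent is to treat each as a small record with those fields plus a location. A sketch only; the dataclass and the example values are invented, not a required format.
```python
from dataclasses import dataclass

@dataclass
class Finding:
    severity: str        # critical, high, medium, or low
    problem: str         # what is wrong or risky
    impact: str          # why it matters in behavior, not only in style
    remedy: str          # missing evidence or change that would address it
    location: str = ""   # file/line reference when available

# Invented example for illustration only.
example = Finding(
    severity="high",
    problem="profile refresh swallows storage errors and returns None",
    impact="callers treat None as a missing profile and log users out",
    remedy="re-raise or add a regression test for the failure path",
    location="profiles/service.py:42",
)
```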
Output format
Expected response shape:
- Findings: listed first, ordered by severity, with file/line references where possible
- Open questions / assumptions: only unresolved issues that affect the review verdict
- Change summary: brief recap of what the diff does
- Residual risk: anything not fully proven by available verification
Examples
Example 1: Review a risky auth change
Input:
Review this PR that changes token refresh handling and session invalidation.
Expected shape:
- focuses on auth and session-boundary regressions before style
- checks whether failure and expiry paths are actually covered
- calls out merge-blocking findings before summaries
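As a concrete illustration of the session-boundary focus, here is an invented refresh handler with the kind of gap the review should catch; none of these names come from the PR itself.
```python
def refresh(request, tokens, sessions):
    # Signature and expiry of the refresh token are verified...
    claims = tokens.decode(request.refresh_token)
    # ...but nothing consults `sessions`, so a session invalidated by
    # logout or password reset can still mint fresh access tokens.
    # The unused parameter is the finding, not the naming.
    return tokens.mint_access_token(claims["user_id"])
```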
Example 2: Review a broad refactor
Input:
Please review this refactor before I merge it. I mostly want to know if there are regression risks or missing tests.
Expected shape:
- looks for behavior drift and hidden assumptions instead of naming-only feedback
- treats missing regression coverage as a real review finding when warranted
- keeps any summary short after the findings list
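A small invented sketch of the behavior drift this kind of review looks for: a refactor that reads as equivalent but quietly changes ordering and duplicate handling.
```python
# Before: keeps duplicates and preserves post order.
def all_tags(posts):
    tags = []
    for post in posts:
        tags.extend(post.tags)
    return tags

# After the refactor: shorter, but now deduplicated and unordered.
# Without a regression test over a list with repeated tags, this drift
# ships silently, which is the finding to raise.
def all_tags(posts):
    return {tag for post in posts for tag in post.tags}
```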
Example 3: Route away when there is no diff
Input:
Review this architecture direction for our checkout redesign.
Expected shape:
- notes that this is not a code review because no concrete change set exists
- routes to a planning, architecture, or design-analysis surface instead
- does not pretend to produce diff findings without evidence
Best practices
- Start from the diff's real risk, not a canned checklist.
- Prefer a few concrete findings over a long list of weak comments.
- Treat missing test evidence as part of review quality.
- Keep findings behavior-focused and reference-backed.
- Add eval coverage before any skill-autoresearch loop on this skill.
- Move detailed heuristics into references so the entrypoint stays compact and triggerable.
References
- Local: references/review-priorities.md
- Local: references/findings-format-and-severity.md