code-review-and-quality
Code Review and Quality
Overview
Multi-dimensional code review with quality gates. Every change gets reviewed before merge — no exceptions. Review covers five axes: correctness, readability, architecture, security, and performance. The standard is: "Would a staff engineer approve this diff and the verification story?"
When to Use
- Before merging any PR or change
- After completing a feature implementation
- When another agent or model produced code you need to evaluate
- When refactoring existing code
- After any bug fix (review both the fix and the regression test)
The Five-Axis Review
Every review evaluates code across these dimensions:
1. Correctness
Does the code do what it claims to do?
- Does it match the spec or task requirements?
- Are edge cases handled (null, empty, boundary values)?
- Are error paths handled (not just the happy path)?
- Does it pass all tests? Are the tests actually testing the right things?
- Are there off-by-one errors, race conditions, or state inconsistencies?
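For example (all names here are hypothetical), a correctness pass on a small pagination helper would probe exactly these boundaries:

```typescript
// Hypothetical pagination helper; the comments are the questions a
// correctness review should ask of it.
function getPage<T>(items: T[], page: number, pageSize: number): T[] {
  // Is `page` 1-indexed or 0-indexed? The arithmetic below assumes 1-indexed.
  // What happens for page < 1 or pageSize < 1? Guard rather than return garbage.
  if (page < 1 || pageSize < 1) return [];
  // Does the last, partial page come back intact? slice() clamps the overrun.
  return items.slice((page - 1) * pageSize, page * pageSize);
}

// getPage([1, 2, 3, 4, 5], 2, 2) -> [3, 4]; getPage([1, 2, 3], 2, 2) -> [3]
```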
2. Readability & Simplicity
Can another engineer (or agent) understand this code without the author explaining it?
- Are names descriptive and consistent with project conventions? (No `temp`, `data`, `result` without context.)
- Is the control flow straightforward (avoid nested ternaries, deep callbacks)?
- Is the code organized logically (related code grouped, clear module boundaries)?
- Are there any "clever" tricks that should be simplified?
- Could this be done in fewer lines? (1000 lines where 100 suffice is a failure)
- Are abstractions earning their complexity? (Don't generalize until the third use case)
- Would comments help clarify non-obvious intent? (But don't comment obvious code.)
- Are there dead code artifacts: no-op variables (`_unused`), backwards-compat shims, or `// removed` comments?
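A minimal before/after sketch of the simplification point, with hypothetical names:

```typescript
// Before: a "clever" nested ternary the reader must mentally unwind.
const labelTerse = (n: number) =>
  n === 0 ? "none" : n === 1 ? "one" : n < 10 ? "few" : "many";

// After: the same mapping as plain control flow, one case per line.
function label(count: number): string {
  if (count === 0) return "none";
  if (count === 1) return "one";
  if (count < 10) return "few";
  return "many";
}
```

Both versions behave identically; the second one is the version a reviewer can verify at a glance.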
3. Architecture
Does the change fit the system's design?
- Does it follow existing patterns or introduce a new one? If new, is it justified?
- Does it maintain clean module boundaries?
- Is there code duplication that should be shared?
- Are dependencies flowing in the right direction (no circular dependencies)?
- Is the abstraction level appropriate (not over-engineered, not too coupled)?
4. Security
Does the change introduce vulnerabilities?
- Is user input validated and sanitized?
- Are secrets kept out of code, logs, and version control?
- Is authentication/authorization checked where needed?
- Are SQL queries parameterized (no string concatenation)?
- Are outputs encoded to prevent XSS?
- Are dependencies from trusted sources with no known vulnerabilities?
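A short sketch of the parameterization point, assuming a node-postgres-style client (the table and helper names are illustrative):

```typescript
import { Client } from "pg";

// Assumes node-postgres; any driver with parameterized queries works the
// same way.
async function findUserByEmail(client: Client, email: string) {
  // Unsafe: `SELECT * FROM users WHERE email = '${email}'` lets crafted
  // input rewrite the query. Parameterized: the driver sends the value
  // separately from the SQL text, so it can never become syntax.
  const result = await client.query(
    "SELECT * FROM users WHERE email = $1",
    [email]
  );
  return result.rows[0] ?? null;
}
```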
5. Performance
Does the change introduce performance problems?
- Any N+1 query patterns?
- Any unbounded loops or unconstrained data fetching?
- Any synchronous operations that should be async?
- Any unnecessary re-renders in UI components?
- Any missing pagination on list endpoints?
- Any large objects created in hot paths?
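A sketch of the N+1 fix, under the same node-postgres assumption as above:

```typescript
import { Client } from "pg";

// Hypothetical N+1 fix. Fetching each author in a loop costs one database
// round trip per post; a single batched query replaces them all.
async function loadAuthors(client: Client, postIds: number[]) {
  // N+1 version (avoid):
  //   for (const id of postIds) await client.query("SELECT ...", [id]);
  const result = await client.query(
    `SELECT p.id AS post_id, u.name AS author
       FROM posts p JOIN users u ON u.id = p.author_id
      WHERE p.id = ANY($1)`,
    [postIds]
  );
  return result.rows;
}
```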
Review Process
Step 1: Understand the Context
Before looking at code, understand the intent:
- What is this change trying to accomplish?
- What spec or task does it implement?
- What is the expected behavior change?
Step 2: Review the Tests First
Tests reveal intent and coverage:
- Do tests exist for the change?
- Do they test behavior (not implementation details)?
- Are edge cases covered?
- Do tests have descriptive names?
- Would the tests catch a regression if the code changed?
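For instance, a behavior-focused test with a descriptive name might look like this (Vitest assumed; the `slugify` helper is hypothetical):

```typescript
import { describe, expect, it } from "vitest";

// Hypothetical function under review.
function slugify(title: string): string {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-|-$/g, "");
}

describe("slugify", () => {
  // Tests observable behavior, has a descriptive name, covers an edge case.
  it("collapses punctuation and spaces into single hyphens", () => {
    expect(slugify("Hello, World!")).toBe("hello-world");
  });

  it("returns an empty string for whitespace-only input", () => {
    expect(slugify("   ")).toBe("");
  });
});
```

Note what these tests do not do: they never inspect the regex or the call order inside `slugify`, so the implementation can change freely while regressions still get caught.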
Step 3: Review the Implementation
Walk through the code with the five axes in mind:
For each file changed:
1. Correctness: Does this code do what the test says it should?
2. Readability: Can I understand this without help?
3. Architecture: Does this fit the system?
4. Security: Any vulnerabilities?
5. Performance: Any bottlenecks?
Step 4: Categorize Findings
| Category | Action | Example |
|---|---|---|
| Critical | Must fix before merge | Security vulnerability, data loss risk, broken functionality |
| Important | Should fix before merge | Missing test, wrong abstraction, poor error handling |
| Suggestion | Consider for improvement | Naming improvement, code style preference, optional optimization |
| Nitpick | Take it or leave it | Formatting, comment wording (skip these in AI reviews) |
Step 5: Verify the Verification
Check the author's verification story:
- What tests were run?
- Did the build pass?
- Was the change tested manually?
- Are there screenshots for UI changes?
- Is there a before/after comparison?
Multi-Model Review Pattern
Use different models for different review perspectives:
Model A writes the code
│
▼
Model B reviews for correctness and architecture
│
▼
Model A addresses the feedback
│
▼
Human makes the final call
This catches issues that a single model might miss — different models have different blind spots.
Example prompt for a review agent:
Review this code change for correctness, security, and adherence to
our project conventions. The spec says [X]. The change should [Y].
Flag any issues as Critical, Important, or Suggestion.
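A minimal sketch of that loop, with the model-calling functions left as placeholders (none of these names come from a real API):

```typescript
// generateCode, reviewCode, and applyFeedback are placeholders for whatever
// model-calling layer the project actually uses.
type Finding = { severity: "Critical" | "Important" | "Suggestion"; note: string };

declare function generateCode(spec: string): Promise<string>;
declare function reviewCode(code: string, spec: string): Promise<Finding[]>;
declare function applyFeedback(code: string, findings: Finding[]): Promise<string>;

async function multiModelReview(spec: string): Promise<string> {
  let code = await generateCode(spec);           // Model A writes the code
  const findings = await reviewCode(code, spec); // Model B reviews it
  const blocking = findings.filter((f) => f.severity !== "Suggestion");
  if (blocking.length > 0) {
    code = await applyFeedback(code, blocking);  // Model A addresses feedback
  }
  return code; // A human still makes the final merge decision.
}
```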
Dead Code Hygiene
After any refactoring or implementation change, check for orphaned code:
- Identify code that is now unreachable or unused
- List it explicitly
- Ask before deleting: "Should I remove these now-unused elements: [list]?"
Don't leave dead code lying around — it confuses future readers and agents. But don't silently delete things you're not sure about. When in doubt, ask.
DEAD CODE IDENTIFIED:
- formatLegacyDate() in src/utils/date.ts — replaced by formatDate()
- OldTaskCard component in src/components/ — replaced by TaskCard
- LEGACY_API_URL constant in src/config.ts — no remaining references
→ Safe to remove these?
Honesty in Review
When reviewing code — whether written by you, another agent, or a human:
- Don't rubber-stamp. "LGTM" without evidence of review helps no one.
- Don't soften real issues. "This might be a minor concern" when it's a bug that will hit production is dishonest.
- Quantify problems when possible. "This N+1 query will add ~50ms per item in the list" is better than "this could be slow."
- Push back on approaches with clear problems. Sycophancy is a failure mode in reviews. If the implementation has issues, say so directly and propose alternatives.
- Accept override gracefully. If the author has full context and disagrees, defer to their judgment.
Dependency Discipline
Part of code review is dependency review:
Before adding any dependency:
- Does the existing stack solve this? (Often it does.)
- How large is the dependency? (Check bundle impact.)
- Is it actively maintained? (Check last commit, open issues.)
- Does it have known vulnerabilities? (`npm audit`)
- What's the license? (Must be compatible with the project.)
Rule: Prefer the standard library and existing utilities over new dependencies. Every dependency is a liability.
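A small illustration of the rule: formatting dates with the built-in `Intl.DateTimeFormat` instead of reaching for a library:

```typescript
// Before adding a date-formatting dependency, check the standard library.
// Intl.DateTimeFormat is built into Node and every modern browser.
function formatDate(date: Date): string {
  return new Intl.DateTimeFormat("en-US", { dateStyle: "medium" }).format(date);
}

formatDate(new Date(2024, 2, 1)); // "Mar 1, 2024" (no new dependency needed)
```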
The Review Checklist
## Review: [PR/Change title]
### Context
- [ ] I understand what this change does and why
### Correctness
- [ ] Change matches spec/task requirements
- [ ] Edge cases handled
- [ ] Error paths handled
- [ ] Tests cover the change adequately
### Readability
- [ ] Names are clear and consistent
- [ ] Logic is straightforward
- [ ] No unnecessary complexity
### Architecture
- [ ] Follows existing patterns
- [ ] No unnecessary coupling or dependencies
- [ ] Appropriate abstraction level
### Security
- [ ] No secrets in code
- [ ] Input validated at boundaries
- [ ] No injection vulnerabilities
- [ ] Auth checks in place
### Performance
- [ ] No N+1 patterns
- [ ] No unbounded operations
- [ ] Pagination on list endpoints
### Verification
- [ ] Tests pass
- [ ] Build succeeds
- [ ] Manual verification done (if applicable)
### Verdict
- [ ] **Approve** — Ready to merge
- [ ] **Request changes** — Issues must be addressed
Common Rationalizations
| Rationalization | Reality |
|---|---|
| "It works, that's good enough" | Working code that's unreadable, insecure, or architecturally wrong creates debt that compounds. |
| "I wrote it, so I know it's correct" | Authors are blind to their own assumptions. Every change benefits from another set of eyes. |
| "We'll clean it up later" | Later never comes. The review is the quality gate — use it. |
| "AI-generated code is probably fine" | AI code needs more scrutiny, not less. It's confident and plausible, even when wrong. |
| "The tests pass, so it's good" | Tests are necessary but not sufficient. They don't catch architecture problems, security issues, or readability concerns. |
Red Flags
- PRs merged without any review
- Review that only checks if tests pass (ignoring other axes)
- "LGTM" without evidence of actual review
- Security-sensitive changes without security-focused review
- Large PRs that are "too big to review properly"
- No regression tests with bug fix PRs
Verification
After review is complete:
- All Critical issues are resolved
- All Important issues are resolved or explicitly deferred with justification
- Tests pass
- Build succeeds
- The verification story is documented (what changed, how it was verified)