Code Review and Quality

Overview

Multi-dimensional code review with quality gates. Every change gets reviewed before merge — no exceptions. Review covers five axes: correctness, readability, architecture, security, and performance. The standard is: "Would a staff engineer approve this diff and the verification story?"

When to Use

Before merging any PR or change
After completing a feature implementation
When another agent or model produced code you need to evaluate
When refactoring existing code
After any bug fix (review both the fix and the regression test)

The Five-Axis Review

Every review evaluates code across these dimensions:

1. Correctness

Does the code do what it claims to do?

Does it match the spec or task requirements?
Are edge cases handled (null, empty, boundary values)?
Are error paths handled (not just the happy path)?
Does it pass all tests? Are the tests actually testing the right things?
Are there off-by-one errors, race conditions, or state inconsistencies?

2. Readability & Simplicity

Can another engineer (or agent) understand this code without the author explaining it?

Are names descriptive and consistent with project conventions? (No temp, data, result without context)
Is the control flow straightforward (avoid nested ternaries, deep callbacks)?
Is the code organized logically (related code grouped, clear module boundaries)?
Are there any "clever" tricks that should be simplified?
Could this be done in fewer lines? (1000 lines where 100 suffice is a failure)
Are abstractions earning their complexity? (Don't generalize until the third use case)
Would comments help clarify non-obvious intent? (But don't comment obvious code.)
Are there dead code artifacts: no-op variables (_unused), backwards-compat shims, or // removed comments?

3. Architecture

Does the change fit the system's design?

Does it follow existing patterns or introduce a new one? If new, is it justified?
Does it maintain clean module boundaries?
Is there code duplication that should be shared?
Are dependencies flowing in the right direction (no circular dependencies)?
Is the abstraction level appropriate (not over-engineered, not too coupled)?

4. Security

Does the change introduce vulnerabilities?

Is user input validated and sanitized?
Are secrets kept out of code, logs, and version control?
Is authentication/authorization checked where needed?
Are SQL queries parameterized (no string concatenation)?
Are outputs encoded to prevent XSS?
Are dependencies from trusted sources with no known vulnerabilities?

5. Performance

Does the change introduce performance problems?

Any N+1 query patterns?
Any unbounded loops or unconstrained data fetching?
Any synchronous operations that should be async?
Any unnecessary re-renders in UI components?
Any missing pagination on list endpoints?
Any large objects created in hot paths?

Review Process

Step 1: Understand the Context

Before looking at code, understand the intent:

- What is this change trying to accomplish?
- What spec or task does it implement?
- What is the expected behavior change?

Step 2: Review the Tests First

Tests reveal intent and coverage:

- Do tests exist for the change?
- Do they test behavior (not implementation details)?
- Are edge cases covered?
- Do tests have descriptive names?
- Would the tests catch a regression if the code changed?

Step 3: Review the Implementation

Walk through the code with the five axes in mind:

For each file changed:
1. Correctness: Does this code do what the test says it should?
2. Readability: Can I understand this without help?
3. Architecture: Does this fit the system?
4. Security: Any vulnerabilities?
5. Performance: Any bottlenecks?

Step 4: Categorize Findings

Category	Action	Example
Critical	Must fix before merge	Security vulnerability, data loss risk, broken functionality
Important	Should fix before merge	Missing test, wrong abstraction, poor error handling
Suggestion	Consider for improvement	Naming improvement, code style preference, optional optimization
Nitpick	Take it or leave it	Formatting, comment wording (skip these in AI reviews)

Step 5: Verify the Verification

Check the author's verification story:

- What tests were run?
- Did the build pass?
- Was the change tested manually?
- Are there screenshots for UI changes?
- Is there a before/after comparison?

Multi-Model Review Pattern

Use different models for different review perspectives:

Model A writes the code
    │
    ▼
Model B reviews for correctness and architecture
    │
    ▼
Model A addresses the feedback
    │
    ▼
Human makes the final call

This catches issues that a single model might miss — different models have different blind spots.

Example prompt for a review agent:

Review this code change for correctness, security, and adherence to
our project conventions. The spec says [X]. The change should [Y].
Flag any issues as Critical, Important, or Suggestion.

Dead Code Hygiene

After any refactoring or implementation change, check for orphaned code:

Identify code that is now unreachable or unused
List it explicitly
Ask before deleting: "Should I remove these now-unused elements: [list]?"

Don't leave dead code lying around — it confuses future readers and agents. But don't silently delete things you're not sure about. When in doubt, ask.

DEAD CODE IDENTIFIED:
- formatLegacyDate() in src/utils/date.ts — replaced by formatDate()
- OldTaskCard component in src/components/ — replaced by TaskCard
- LEGACY_API_URL constant in src/config.ts — no remaining references
→ Safe to remove these?

Honesty in Review

When reviewing code — whether written by you, another agent, or a human:

Don't rubber-stamp. "LGTM" without evidence of review helps no one.
Don't soften real issues. "This might be a minor concern" when it's a bug that will hit production is dishonest.
Quantify problems when possible. "This N+1 query will add ~50ms per item in the list" is better than "this could be slow."
Push back on approaches with clear problems. Sycophancy is a failure mode in reviews. If the implementation has issues, say so directly and propose alternatives.
Accept override gracefully. If the author has full context and disagrees, defer to their judgment.

Dependency Discipline

Part of code review is dependency review:

Before adding any dependency:

Does the existing stack solve this? (Often it does.)
How large is the dependency? (Check bundle impact.)
Is it actively maintained? (Check last commit, open issues.)
Does it have known vulnerabilities? (npm audit)
What's the license? (Must be compatible with the project.)

Rule: Prefer standard library and existing utilities over new dependencies. Every dependency is a liability.

The Review Checklist

## Review: [PR/Change title]

### Context
- [ ] I understand what this change does and why

### Correctness
- [ ] Change matches spec/task requirements
- [ ] Edge cases handled
- [ ] Error paths handled
- [ ] Tests cover the change adequately

### Readability
- [ ] Names are clear and consistent
- [ ] Logic is straightforward
- [ ] No unnecessary complexity

### Architecture
- [ ] Follows existing patterns
- [ ] No unnecessary coupling or dependencies
- [ ] Appropriate abstraction level

### Security
- [ ] No secrets in code
- [ ] Input validated at boundaries
- [ ] No injection vulnerabilities
- [ ] Auth checks in place

### Performance
- [ ] No N+1 patterns
- [ ] No unbounded operations
- [ ] Pagination on list endpoints

### Verification
- [ ] Tests pass
- [ ] Build succeeds
- [ ] Manual verification done (if applicable)

### Verdict
- [ ] **Approve** — Ready to merge
- [ ] **Request changes** — Issues must be addressed

Common Rationalizations

Rationalization	Reality
"It works, that's good enough"	Working code that's unreadable, insecure, or architecturally wrong creates debt that compounds.
"I wrote it, so I know it's correct"	Authors are blind to their own assumptions. Every change benefits from another set of eyes.
"We'll clean it up later"	Later never comes. The review is the quality gate — use it.
"AI-generated code is probably fine"	AI code needs more scrutiny, not less. It's confident and plausible, even when wrong.
"The tests pass, so it's good"	Tests are necessary but not sufficient. They don't catch architecture problems, security issues, or readability concerns.

Red Flags

PRs merged without any review
Review that only checks if tests pass (ignoring other axes)
"LGTM" without evidence of actual review
Security-sensitive changes without security-focused review
Large PRs that are "too big to review properly"
No regression tests with bug fix PRs

Verification

After review is complete:

All Critical issues are resolved
All Important issues are resolved or explicitly deferred with justification
Tests pass
Build succeeds
The verification story is documented (what changed, how it was verified)

code-review-and-quality