Refactor

Review code changes in current branch against main, grade the quality, and iteratively refactor until the code meets a minimum quality threshold of 90/100.

Instructions

1. Get Changed Files

git diff main --name-only

Get the full diff for context:

git diff main

2. Run Vitest

Run Vitest in CI mode before grading:

pnpm vitest run --reporter=verbose

If tests fail, factor that into the grade (deduct from Test Coverage score). Don't run Playwright - ask the user to run those manually if needed.

3. Review and Grade

Review the code changes for:

Correctness: Does the code do what it's supposed to?
TypeScript: No any, proper types, strict mode compliance (see TypeScript Check below)
React patterns: Proper hooks usage, client/server boundaries
Code style: Follows project conventions (no semicolons, proper imports)
Error handling: Appropriate error boundaries and handling
Performance: No obvious N+1 queries, unnecessary re-renders
Security: No vulnerabilities (XSS, injection, etc.)
Accessibility: Proper aria attributes, semantic HTML
Test coverage: New logic/features have tests, edge cases covered (checks tests exist, doesn't run them). See Test Coverage Deep Check below.
DRY: No unnecessary duplication
Clarity: Code is readable and self-documenting

Also include all checks from /preflight (dead code, git hygiene, env vars, imports, dates, etc.). A score of 90+ means preflight would pass.

Assign a letter grade (A-F) and a score out of 100.

4. Branch Logic

If score >= 90

Output to terminal: "Good enough, let's ship it!"
List what could have made the score higher (but don't block on it)
Get the current branch name:
```
git branch --show-current
```
Create tmp/ directory in project root if it doesn't exist
Write suggestions to tmp/{branch-name}.md with:
- Final grade and score
- Summary of what's good
- List of potential improvements (nice-to-have)

If score < 90

Identify the issues bringing the score down
Refactor the code to fix those issues
After refactoring, re-review and grade again
Repeat until score >= 90
Then follow the >= 90 flow above

Grading Scale

Grade	Score Range	Meaning
A+	97-100	Exceptional - production ready, exemplary code
A	93-96	Excellent - minor nitpicks only
A-	90-92	Very Good - meets quality bar
B+	87-89	Good - small issues to address
B	83-86	Above Average - some improvements needed
B-	80-82	Satisfactory - multiple issues
C+	77-79	Adequate - significant issues
C	73-76	Acceptable - needs work
C-	70-72	Below Average - substantial issues
D	60-69	Poor - major problems
F	<60	Failing - fundamental issues

Output Format

Terminal Output (always)

## Code Review: {branch-name}

> **Tests:** Vitest ran in CI mode. Playwright skipped - run manually if needed.

**Grade: {grade}** ({score}/100)

### Summary
{Brief description of what the changes do}

### What's Good
- {positive point 1}
- {positive point 2}

### Issues Found
- {file}:{line} - {issue description}

### Score Breakdown
- Correctness: {x}/15
- TypeScript: {x}/10
- React Patterns: {x}/10
- Code Style: {x}/10
- Error Handling: {x}/10
- Performance: {x}/10
- Security: {x}/10
- Test Coverage: {x}/10
- Accessibility: {x}/5
- DRY: {x}/5
- Clarity: {x}/5

{If score >= 90: "Good enough, let's ship it!"}
{If score < 90: "Refactoring to improve score..."}

Markdown File (tmp/{branch-name}.md)

Only written when score >= 90 (either initially or after refactoring):

# Code Review: {branch-name}

**Date:** {date}
**Final Grade:** {grade} ({score}/100)

## Summary
{What the changes accomplish}

## What's Good
{List of positive aspects}

## Potential Improvements
{Things that could have pushed the score higher - nice to have, not blockers}

## Refactoring History (if applicable)
{If refactoring was needed, list what was changed and score progression}

TypeScript Check

Some projects have legacy type errors in unrelated files. To check only changed files:

# Check if 'types' target is available in make output
make 2>/dev/null | grep -q 'types' && make types || pnpm tsc --noEmit

Use make types if available (checks changed files only), otherwise fall back to pnpm tsc --noEmit.

Test Coverage Deep Check

Don't just check if new helper functions have tests - trace the code path:

Integration over unit: If you add a helper function getX(), find where it's called (e.g., calculateY()) and verify calculateY() has tests covering the new logic
Follow the call chain: New code in utils/foo.ts → used by calculateBar() → verify calculateBar.test.ts covers the new behavior
Test the behavior, not just the function: A test for getMonthsInBillingInterval() returning 3 is good, but a test proving calculateLettingInvoiceCosts() actually multiplies by 3 for quarterly is better

Example gap to catch:

❌ "Added tests for getMonthsInBillingInterval()" (tests the helper)
✅ "Added tests for calculateLettingInvoiceCosts() with QUARTERLY billing" (tests the integration)

Edge Case Analysis

Actively question defensive code patterns:

Fallback values: When you see x ?? defaultValue or x ? fn(x) : fallback, ask:
- What happens when the fallback is used?
- Is the fallback correct for all scenarios?
- Should this be an error instead of a silent default?
Optional chaining: When you see obj?.prop, trace what happens when obj is null:
- Does downstream code handle the undefined correctly?
- Are there tests for the null case?
Type narrowing: When interfaces allow null/undefined, verify the code handles both branches

Example to catch:

const months = billing?.interval ? getMonths(billing.interval) : 1
const cost = new Decimal(billing?.rate || 0).mul(months)

Ask: "If billing is null, we get 0 * 1 = 0. Is that correct, or should this throw?"

Notes

Focus on meaningful improvements, not bikeshedding
Each refactoring iteration should target the highest-impact issues first
Don't over-engineer - fix what's wrong, don't add unnecessary complexity
If stuck in a loop (same issues keep appearing), break out and explain why
Maximum 5 refactoring iterations to prevent infinite loops

refactor

Refactor

Instructions

1. Get Changed Files

2. Run Vitest

3. Review and Grade

4. Branch Logic

If score >= 90

If score < 90

Grading Scale

Output Format

Terminal Output (always)

Markdown File (tmp/{branch-name}.md)

TypeScript Check

Test Coverage Deep Check

Edge Case Analysis

Notes

More from cerico/macfair

infographic

visx

creative-design

threejs

prototype

test-review