synthesis-codebase-review
Enterprise-Grade Codebase Review
Purpose
A comprehensive, practical codebase audit methodology for projects of any size. Not every project needs every check — the tiered system ensures you apply the right level of rigor.
For the full detailed checklist, see references/detailed-checklist.md in this skill directory.
How to Use This Skill
Step 1: Assess Project Tier
Complete the Project Complexity Assessment to determine which tier applies.
Step 2: Review Applicable Sections
Each section and many individual items are marked with tier indicators:
- Essential (Tier 1) — Apply to ALL projects, even weekend hacks
- Standard (Tier 2) — Apply to team projects and production apps
- Enterprise (Tier 3) — Apply to large-scale, multi-team, or regulated systems
- Mission-Critical (Tier 4) — Apply to financial, healthcare, infrastructure, or high-stakes systems
Step 3: Skip What Doesn't Apply
- Tier 1: focus only on Essential items (~50 checks)
- Tier 2: include Essential and Standard items (~150 checks)
- Tier 3: include Essential, Standard, and Enterprise items (~400 checks)
- Tier 4: include everything (~900+ checks)
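The cumulative rule above (each tier includes every lower tier) can be sketched as a simple filter. This is a minimal illustration, not part of the skill: the `Check` type, the `applicable_checks` function, and the sample items are all hypothetical, and the real items live in references/detailed-checklist.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """Hypothetical checklist item. tier: 1 = Essential ... 4 = Mission-Critical."""
    name: str
    tier: int  # lowest project tier at which this check applies


def applicable_checks(checks, project_tier):
    """Return the checks whose tier is at or below the project's tier."""
    if project_tier not in (1, 2, 3, 4):
        raise ValueError(f"unknown tier: {project_tier}")
    return [c for c in checks if c.tier <= project_tier]


# Illustrative sample items only, not taken from the detailed checklist.
SAMPLE = [
    Check("No secrets in code", 1),
    Check("Tests run in CI", 2),
    Check("Tenant isolation verified", 3),
    Check("Disaster recovery drill documented", 4),
]

if __name__ == "__main__":
    for check in applicable_checks(SAMPLE, 2):
        print(check.name)
```

A Tier 2 project keeps the Essential and Standard items and drops the rest.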
Pre-Flight Checklist
Complete these checks before starting any review. Skipping pre-flight has caused real wasted effort on real engagements.
Branch Selection
- Confirm which branch represents the current working state. Do NOT assume `main` is current
- Check the most recent commit date on the target branch. If `main` has not been updated in weeks and there is an active branch with many commits ahead, you are likely reviewing a stale snapshot
- Ask whether the team uses git-flow, trunk-based, or another model. In git-flow, `develop` is often the correct review target
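The stale-snapshot check can be expressed as a small predicate. This is a sketch under stated assumptions: the function name `looks_stale` and both thresholds (21 days for "weeks", 20 commits for "many") are illustrative choices, not values from this skill. The inputs could come from `git log -1 --format=%cI main` and `git rev-list --count main..develop`.

```python
from datetime import datetime, timedelta, timezone


def looks_stale(last_commit, commits_ahead, now, max_age_days=21, min_ahead=20):
    """Flag a branch that is old AND far behind an active branch.

    last_commit:   datetime of the newest commit on the candidate branch
    commits_ahead: commits on the most active other branch that the
                   candidate branch lacks
    Both thresholds are assumptions; tune them per team.
    """
    too_old = (now - last_commit) > timedelta(days=max_age_days)
    far_behind = commits_ahead >= min_ahead
    return too_old and far_behind


if __name__ == "__main__":
    now = datetime(2025, 4, 15, tzinfo=timezone.utc)
    # Timestamp in the ISO 8601 form that `git log --format=%cI` prints.
    main_last = datetime.fromisoformat("2025-03-01T09:30:00+00:00")
    if looks_stale(main_last, commits_ahead=45, now=now):
        print("main looks stale; confirm the review target with the team")
```

A positive result is a prompt to ask the team, not a verdict: a quiet `main` is normal in some release models.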
Review Scope
- Confirm which directories to exclude (vendor/, node_modules/, generated/, etc.)
- Ask if a previous review has been conducted. If yes, obtain prior findings to enable delta review mode
- Confirm expected output format (markdown, PDF, etc.) and audience (engineering team, leadership, both)
Project Complexity Assessment
Score each characteristic (0 = No, 1 = Yes):
Scale & Users: >1 developer, >5 developers, >20 developers, >100 users, >10K users, >1M users
Business Criticality: Production system, downtime costs money, downtime costs >$10K/hour, breach would make news, contractual SLAs
Data Sensitivity: User accounts, PII, financial data, health data (HIPAA), regulated data (GDPR, SOX)
Architecture Complexity: >1 service, >5 services, database exists, multiple data stores, third-party integrations, >5 integrations
Operational Requirements: 99% uptime, 99.9% uptime, 99.99% uptime, dedicated ops/SRE team, 24/7 on-call
| Score Range | Tier | Description |
|---|---|---|
| 0-4 | Tier 1 - Essential | Solo/hobby projects, prototypes, internal tools |
| 5-10 | Tier 2 - Standard | Small team projects, production apps, startups |
| 11-18 | Tier 3 - Enterprise | Large teams, regulated industries, enterprise customers |
| 19+ | Tier 4 - Mission-Critical | Financial systems, healthcare, critical infrastructure |
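The scoring above is a sum of yes/no answers mapped onto the table's ranges. A minimal sketch follows; the function names and the sample answers are hypothetical, while the range boundaries are copied from the table.

```python
def tier_for_score(score):
    """Map a complexity score to its tier label, per the score-range table."""
    if score < 0:
        raise ValueError("score cannot be negative")
    if score <= 4:
        return "Tier 1 - Essential"
    if score <= 10:
        return "Tier 2 - Standard"
    if score <= 18:
        return "Tier 3 - Enterprise"
    return "Tier 4 - Mission-Critical"


def assess(answers):
    """answers maps each characteristic to True (1 point) or False (0 points)."""
    score = sum(1 for yes in answers.values() if yes)
    return score, tier_for_score(score)


# Hypothetical answers for a small production startup.
EXAMPLE = {
    ">1 developer": True,
    ">5 developers": False,
    ">100 users": True,
    "Production system": True,
    "Downtime costs money": True,
    "User accounts": True,
    "PII": True,
    "Database exists": True,
    "Third-party integrations": True,
    "99% uptime": True,
}

if __name__ == "__main__":
    print(assess(EXAMPLE))  # (9, 'Tier 2 - Standard')
```

Characteristics left out of the dictionary count as "No", so a partial assessment can only under-score a project.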
Minimum Viable Review (15-Minute Quick Check)
Use this for a rapid health assessment. These are the absolute essentials that apply to ANY project.
Security Essentials (5 minutes)
- No secrets in code: run `git log -p | grep -i "password\|secret\|api_key\|token"`; it should return nothing
- Dependencies not ancient: check for critical vulnerabilities (`npm audit`, `pip-audit`, etc.)
- HTTPS only for all external communication
- Input validated before use
- Auth exists if there are users
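The keyword search used for the secrets check can also be run over any text (a diff, a config file) from Python. This sketch mirrors the four keywords of the grep pattern; the function name and the sample text are hypothetical. Keyword matching is a coarse first pass that yields false positives, and it does not replace a dedicated secret scanner.

```python
import re

# Same four keywords as the grep pattern, matched case-insensitively.
SECRET_HINT = re.compile(r"password|secret|api_key|token", re.IGNORECASE)


def find_secret_hints(text):
    """Return (line_number, stripped_line) for every line that mentions a keyword."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if SECRET_HINT.search(line)
    ]


# Hypothetical diff fragment used only to exercise the function.
SAMPLE_DIFF = """\
+DB_HOST = "localhost"
+DB_PASSWORD = "hunter2"
+timeout = 30
+headers = {"Authorization": "Bearer " + API_TOKEN}
"""

if __name__ == "__main__":
    for number, line in find_secret_hints(SAMPLE_DIFF):
        print(f"{number}: {line}")
```

Each hit still needs a human look: a variable named `API_TOKEN` read from the environment is fine, while a literal value next to `DB_PASSWORD` is a finding.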
Code Health (5 minutes)
- It builds: clean build with no errors
- Tests exist and pass
- No obvious duplication (no copy-pasted files or massive repeated blocks)
- Readable: a new developer could understand the main flow
Operations Essentials (5 minutes)
- README exists with instructions on how to run it
- Documented or automated deployment process
- Application produces logs
- Errors logged or sent somewhere visible
- Config externalized (no hardcoded environment-specific values)
Quick Score: ___ / 15. If you score <12, address the gaps before proceeding.
Review Categories
The full detailed checklist is in references/detailed-checklist.md. Here is an overview of all 16 review categories:
1. Architecture & System Design
Architectural foundation, API design and contracts, service communication, data architecture. Evaluate whether the chosen patterns are appropriate for scale and team size.
2. Secrets, Credentials & Sensitive Data
Active secret scanning, secret type inventory, AI tool configuration files, comment-aware credential scanning, secret management, preventive controls. This section is CRITICAL for all tiers.
3. Code Duplication & Reusability
Duplication analysis, shared code and libraries, abstraction quality.
4. Code Quality, Efficiency & Optimization
Basic code quality, algorithmic efficiency, database efficiency, memory and resource efficiency, concurrency and thread safety.
5. Clean Code & Software Engineering Principles
Naming and readability, function design, SOLID principles, error handling, defensive programming.
6. Code Readability & AI/Human Maintainability
Human readability, documentation, AI and automation friendliness.
7. Testing
Test existence, coverage, test types (unit, integration, API, E2E, performance, security), test quality (verify tests actually test behavior, not just imports).
8. Security
Authentication, authorization, input validation, data protection, dependency security.
9. Multi-Tenancy (Tier 3+)
Tenant isolation, configuration, lifecycle.
10. Identity & SSO (Tier 3+)
SSO support, session management.
11. Scalability & Performance (Tier 2+)
Horizontal scaling, auto-scaling, response times, caching, CDN.
12. Reliability (Tier 2+)
Fault tolerance, data durability, backups, disaster recovery.
13. Observability (Tier 2+)
Logging, monitoring, alerting, distributed tracing.
14. Deployment & Operations
Build and deploy documentation and automation, deployment strategy, configuration management.
15. Licensing & Legal
Dependency licenses, intellectual property, attribution.
16. Developer Experience
Getting started documentation, development workflow, CI speed.
Addenda
- Open Source Software Addendum — License, community docs, security policy, versioning, distribution, contribution workflow, project health, testing, documentation
- Proprietary Software Addendum — Trade secret protection, vendor management, customer data protection
- Industry-Specific Addenda — Financial services, healthcare, e-commerce, government/public sector
Output Format
Key Principle: Strengths Before Findings
All reports lead with strengths before findings. Demonstrating that you understand what the team built well makes critical findings land as constructive guidance rather than an attack.
Tier 1-2: Simplified Report
## Codebase Review Summary
**Project**: [Name]
**Tier**: [1-Essential / 2-Standard]
**Date**: YYYY-MM-DD
### Quick Health Check: Pass / Issues / Fail
### Strengths
1. [What the codebase does well]
2. [Notable good practices]
### Key Findings
| # | Finding | Severity | Location | Fix |
|---|---------|----------|----------|-----|
| 1 | [Description] | Critical/High/Medium/Low | `path:line` | [Action] |
### Recommended Actions
1. [Top priority action]
2. [Second priority]
3. [Third priority]
Tier 3-4: Full Report
Include executive summary with overall score, per-category scores, top strengths, top critical findings, detailed findings with severity/location/evidence/recommendation/effort, and a phased action plan (immediate, short-term, medium-term).
Delta Review Mode
When a prior review exists, use delta mode. A standalone review says "here are your problems." A delta review says "here is your trajectory." The second is far more useful for engineering leadership.
For each finding from the prior review, classify its current status:
| Status | Meaning |
|---|---|
| Fixed | Finding fully resolved |
| Partially Fixed | Improvement made but not complete |
| Still Present | No change — deferred or not yet addressed |
| Worse | Finding has regressed or expanded in scope |
| New | Finding not present in prior review |
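One way to mechanize the status table is to compare occurrence counts per finding between the two reviews. This is a sketch under a strong assumption: that a finding's scope can be approximated by how many places it occurs, which the skill does not prescribe. The finding IDs and the function name are hypothetical, and a reviewer should still confirm each status by hand.

```python
def classify_findings(prior, current):
    """Classify findings by comparing occurrence counts between two reviews.

    prior, current: dicts mapping a finding ID to the number of occurrences
    found in that review. Prior counts are assumed to be positive.
    """
    statuses = {}
    for finding_id, before in prior.items():
        after = current.get(finding_id, 0)
        if after == 0:
            statuses[finding_id] = "Fixed"
        elif after < before:
            statuses[finding_id] = "Partially Fixed"
        elif after == before:
            statuses[finding_id] = "Still Present"
        else:
            statuses[finding_id] = "Worse"
    for finding_id, after in current.items():
        if finding_id not in prior and after > 0:
            statuses[finding_id] = "New"
    return statuses


if __name__ == "__main__":
    # Hypothetical finding IDs and counts.
    prior = {"SEC-01": 3, "DUP-02": 5, "TEST-03": 2, "OPS-04": 1}
    current = {"DUP-02": 2, "TEST-03": 2, "OPS-04": 4, "SEC-05": 1}
    for finding_id, status in sorted(classify_findings(prior, current).items()):
        print(f"{finding_id}: {status}")
```

Counting the resulting statuses gives the trajectory view that delta mode is meant to provide.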
Deliverable Organization
Date-stamp review deliverables in folders:
reviews/
├── 2025-01-15/
│ ├── review-summary.md
│ ├── detailed-findings.md
│ └── executive-report.pdf
├── 2025-04-15/
│ ├── delta-review.md ← compares against 2025-01-15
│ ├── detailed-findings.md
│ └── executive-report.pdf
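With ISO-dated folder names, the baseline for a delta review can be found mechanically. A minimal sketch, assuming folders are named `YYYY-MM-DD` as in the layout above; the function names are hypothetical, and any folder whose name is not a date is ignored.

```python
from datetime import date


def parse_review_date(name):
    """Return the date encoded in a folder name, or None if it is not a date."""
    try:
        return date.fromisoformat(name)
    except ValueError:
        return None


def previous_review(folder_names, current):
    """Return the most recent dated folder strictly before `current`, or None."""
    current_date = date.fromisoformat(current)
    earlier = [
        parsed
        for parsed in (parse_review_date(name) for name in folder_names)
        if parsed is not None and parsed < current_date
    ]
    return max(earlier).isoformat() if earlier else None


if __name__ == "__main__":
    folders = ["2025-01-15", "2025-04-15", "drafts"]
    print(previous_review(folders, "2025-04-15"))  # 2025-01-15
```

In practice the folder names could come from listing the `reviews/` directory; when the function returns `None`, there is no baseline and the review is a standalone one.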
Relationship to Other Verification Skills
This skill evaluates the entire system. Four companion skills cover narrower scopes:
- synthesis-implementation-integrity — verifies a single change is genuinely complete (self-review, run after every implementation)
- synthesis-code-audit — systematic 10-dimension quality scan of a diff (run after implementation, as input to preflight or pr-review)
- synthesis-preflight — branch readiness gate that orchestrates tests, types, audit, and commit hygiene into a go/no-go verdict (run before creating a PR)
- synthesis-pr-review — evaluates a change proposed for merge (peer review, run on every PR)
Together these five skills form a verification chain: implementation-integrity catches change-level gaps, code-audit measures quality across 10 dimensions, preflight gates the branch before it becomes a PR, pr-review catches what earlier checks missed through judgment-based evaluation, and this skill catches systemic patterns that no single change reveals.
Key Principles
- Tier-appropriate rigor. Do not apply Tier 4 scrutiny to a weekend project. Do not skip Tier 1 basics for an enterprise system.
- Evidence-based findings. Every finding must cite specific file paths and line numbers with concrete remediation steps.
- Practical over theoretical. Every checklist item should catch real issues found in real codebases.
- AI-friendly wording. Checklist items should be clear enough for AI assistants to evaluate programmatically.
- Strengths first. Understanding what the team did well is as important as finding problems.