analysis-codebase
Analyzing Codebase
Overview
Systematically analyze existing codebases to extract structural information. Supports three modes: Context (project characteristics), Brownfield (entities and collision risks), and Setup-Brownfield (comprehensive analysis for /humaninloop:setup).
When to Use
- Setting up constitution on existing codebase (brownfield projects)
- Planning new features against existing code
- Understanding tech stack before making changes
- Detecting collision risks for new entities or endpoints
- Running
/humaninloop:setupon projects with existing code - Gathering project context for governance decisions
When NOT to Use
- Greenfield projects: No existing code to analyze; start with
humaninloop:authoring-constitutiondirectly - Single-file scripts: No architectural patterns to extract
- Documentation-only review: Use standard file reading instead
- Before project directory exists: Nothing to analyze yet
- When user provides complete context: Skip analysis if user already documented tech stack and patterns
Common Mistakes
| Mistake | Problem | Fix |
|---|---|---|
| Assuming framework | Guessing without evidence | Verify with code patterns |
| Missing directories | Only checking standard paths | Projects vary, explore |
| Over-extracting | Analyzing every file | Focus on config and patterns |
| Ignoring governance | Missing existing decisions | Check README, CLAUDE.md, ADRs |
| Inventing findings | Documenting assumptions | Only report what is found |
Mode Selection
| Mode | When to Use | Output |
|---|---|---|
| Context | Setting up constitution, understanding project DNA | Markdown report for humans |
| Brownfield | Planning new features against existing code | JSON inventory with collision risks |
| Setup-Brownfield | /humaninloop:setup on existing codebase |
codebase-analysis.md with inventory + assessment |
Project Type Detection
Identify project type from package manager files:
| File | Project Type |
|---|---|
package.json |
Node.js/JavaScript/TypeScript |
pyproject.toml / requirements.txt |
Python |
go.mod |
Go |
Cargo.toml |
Rust |
pom.xml / build.gradle |
Java |
Gemfile |
Ruby |
pubspec.yaml |
Flutter/Dart |
Framework Detection
Web Frameworks
| Framework | Indicators |
|---|---|
| Express | express(), router.get(), app.use() |
| FastAPI | @app.get(), FastAPI(), APIRouter |
| Django | urls.py, views.py, models.py pattern |
| Flask | @app.route(), @bp.route() |
| Rails | routes.rb, app/models/, app/controllers/ |
| Spring | @RestController, @GetMapping, @Entity |
| Gin/Echo | r.GET(), e.GET() |
ORM/Database Frameworks
| Framework | Indicators |
|---|---|
| Prisma | schema.prisma, @prisma/client |
| TypeORM | @Entity(), @Column(), DataSource |
| SQLAlchemy | Base, db.Model, Column() |
| Django ORM | models.Model, models.CharField |
| GORM | gorm.Model, db.AutoMigrate |
| Mongoose | mongoose.Schema, new Schema({ |
| ActiveRecord | ApplicationRecord, has_many |
Architecture Pattern Recognition
| Pattern | Indicators |
|---|---|
| Layered | src/models/, src/services/, src/controllers/ |
| Feature-based | src/auth/, src/users/, src/tasks/ |
| Microservices | Multiple package files, docker compose |
| Serverless | serverless.yml, lambda/, functions/ |
| MVC | models/, views/, controllers/ |
| Clean/Hexagonal | domain/, application/, infrastructure/ |
Mode: Context Gathering
For constitution authoring - gather broad project characteristics.
What to Extract:
- Tech stack with versions
- Linting/formatting conventions
- CI/CD quality gates
- Team signals (test coverage, required approvals, CODEOWNERS)
- Existing governance docs (CODEOWNERS, ADRs, CONTRIBUTING.md)
Output: Project Context Report (markdown)
See references/CONTEXT-GATHERING.md for detailed guidance.
Mode: Brownfield Analysis
For planning - extract structural details for collision detection.
What to Extract:
- Entities with fields and relationships
- Endpoints with handlers
- Collision risks against proposed spec
Output: Codebase Inventory (JSON)
See references/BROWNFIELD-ANALYSIS.md for detailed guidance.
Mode: Setup Brownfield
For /humaninloop:setup - comprehensive analysis combining Context + Brownfield with Essential Floor assessment.
What to Extract:
- Everything from Context mode (tech stack, conventions, architecture)
- Everything from Brownfield mode (entities, relationships)
- Essential Floor assessment (Security, Testing, Error Handling, Observability)
- Inconsistencies and strengths assessment
Output: .humaninloop/memory/codebase-analysis.md following codebase-analysis-template.md
Essential Floor Analysis
Assess each of the four essential floor categories:
Security Assessment
| Check | How to Detect | Status Values |
|---|---|---|
| Auth at boundaries | Middleware patterns (authenticate, authorize, requireAuth) |
present/partial/absent |
| Secrets from env | .env.example exists, no hardcoded credentials in code |
present/partial/absent |
| Input validation | Schema validation libraries, input checking patterns | present/partial/absent |
Indicators to search:
# Auth middleware
grep -r "authenticate\|authorize\|requireAuth\|isAuthenticated" src/ 2>/dev/null
# Environment variables
ls .env.example .env.sample 2>/dev/null
grep -r "process.env\|os.environ\|os.Getenv" src/ 2>/dev/null
# Validation
grep -r "zod\|yup\|joi\|pydantic\|validator" package.json pyproject.toml 2>/dev/null
Testing Assessment
| Check | How to Detect | Status Values |
|---|---|---|
| Test framework configured | Config files (jest.config.*, pytest.ini, vitest.config.*) |
present/partial/absent |
| Test files present | Files matching *.test.*, *_test.*, test_*.* |
present/partial/absent |
| CI runs tests | Test commands in workflow files | present/partial/absent |
Indicators to search:
# Test config
ls jest.config.* vitest.config.* pytest.ini pyproject.toml 2>/dev/null
# Test files
find . -name "*.test.*" -o -name "*_test.*" -o -name "test_*.*" 2>/dev/null | head -5
# CI test commands
grep -r "npm test\|yarn test\|pytest\|go test" .github/workflows/ 2>/dev/null
Error Handling Assessment
| Check | How to Detect | Status Values |
|---|---|---|
| Explicit error types | Custom error classes/types defined | present/partial/absent |
| Context preservation | Error messages include context, stack traces logged | present/partial/absent |
| Appropriate status codes | API responses use correct HTTP status codes | present/partial/absent |
Indicators to search:
# Custom errors
grep -r "class.*Error\|extends Error\|Exception" src/ 2>/dev/null | head -5
# Error logging
grep -r "error.*context\|error.*stack\|logger.error" src/ 2>/dev/null | head -3
# Status codes
grep -r "status(4\|status(5\|HttpStatus\|status_code" src/ 2>/dev/null | head -3
Observability Assessment
| Check | How to Detect | Status Values |
|---|---|---|
| Structured logging | Logger config (winston, pino, structlog, logrus) | present/partial/absent |
| Correlation IDs | Request ID middleware, trace ID patterns | present/partial/absent |
| No PII in logs | Log sanitization, no email/password in log statements | present/partial/absent |
Indicators to search:
# Logger config
grep -r "winston\|pino\|structlog\|logrus\|zap" package.json pyproject.toml go.mod 2>/dev/null
# Correlation IDs
grep -r "requestId\|correlationId\|traceId\|x-request-id" src/ 2>/dev/null | head -3
# PII check (negative - should NOT find these in logs)
grep -r "logger.*email\|logger.*password\|log.*password" src/ 2>/dev/null
Setup-Brownfield Quality Checklist
Before finalizing setup-brownfield analysis:
- Project identity complete (name, language, framework, entry points)
- Directory structure documented with purposes
- Architecture pattern identified with evidence
- Naming conventions documented (files, variables, functions, classes)
- All four Essential Floor categories assessed
- Domain entities extracted with relationships
- External dependencies documented
- Strengths to preserve identified (minimum 2-3)
- Inconsistencies documented with severity
- Recommendations provided for constitution focus
Detection Script
Run the automated detection script for fast, deterministic stack identification:
bash scripts/detect-stack.sh /path/to/project
Output:
{
"project_type": "nodejs",
"package_manager": "npm",
"frameworks": ["express"],
"orms": ["prisma"],
"architecture": ["feature-based"],
"ci_cd": ["github-actions"],
"files_found": {...}
}
The script detects:
- Project type: nodejs, python, go, rust, java, ruby, flutter, elixir
- Package manager: npm, yarn, pnpm, pip, poetry, cargo, etc.
- Frameworks: express, fastapi, django, nextjs, gin, rails, spring-boot, etc.
- ORMs: prisma, typeorm, sqlalchemy, mongoose, gorm, activerecord, etc.
- Architecture: clean-architecture, mvc, layered, feature-based, serverless, microservices
- CI/CD: github-actions, gitlab-ci, jenkins, circleci, etc.
Usage pattern:
- Run script first for deterministic baseline
- Use script output to guide deeper LLM analysis
- Script findings are ground truth; LLM adds nuance
Manual Detection Commands
For cases where script detection is insufficient:
# Tech stack detection
cat package.json | jq '{name, engines, dependencies}'
cat pyproject.toml
cat .tool-versions .nvmrc .python-version 2>/dev/null
# Architecture detection
ls -d src/domain src/application src/features 2>/dev/null
# CI/CD detection
ls .github/workflows/*.yml .gitlab-ci.yml 2>/dev/null
# Governance detection
ls CODEOWNERS .github/CODEOWNERS docs/CODEOWNERS 2>/dev/null
cat CODEOWNERS 2>/dev/null | head -20
# Test structure
ls -d test/ tests/ spec/ __tests__/ 2>/dev/null
Quality Checklist
Before finalizing analysis:
Both Modes:
- Project type and framework correctly identified
- Architecture pattern documented
- File paths cited for all findings
Context Mode:
- Existing linting/formatting config extracted
- CI quality gates analyzed
- Existing governance docs checked (CODEOWNERS, ADRs, CONTRIBUTING.md)
- Approvers identified (from CODEOWNERS or team structure)
- Recommendations provided
Brownfield Mode:
- All entity directories scanned
- All route directories scanned
- Collision risks classified by severity
Setup-Brownfield Mode:
- All Context Mode checks completed
- All four Essential Floor categories assessed
- Strengths and inconsistencies documented
- Output written to
.humaninloop/memory/codebase-analysis.md
Related Skills
- For brownfield constitutions: REQUIRED: Use humaninloop:brownfield-constitution after analysis
- For greenfield projects: OPTIONAL: Use humaninloop:authoring-constitution directly
- For validation: OPTIONAL: Use humaninloop:validation-constitution after constitution creation