codebase-spec-extractor

Codebase Spec Extractor

Overview

Extract engineering specifications from existing codebases with one goal: produce documentation so complete that the project can be fully replicated without seeing the original source code.

This is NOT about filling templates. This is about deconstructing every detail and documenting it with engineering precision.

Requirements

Bash (the macOS default, Bash 3.2, is supported)
Standard Unix tools: find, grep, sed, wc, date
Optional (recommended): rg (ripgrep) for faster scans in large repos

When to Use

You need implementation-grade specs for a codebase (migration, rewrite, onboarding, vendor handoff).
You keep finding “implicit rules” in code and want them made explicit.
You want a repeatable inventory + spec skeleton + verification workflow.

When NOT to Use

You only need a quick architecture overview or a lightweight README update.
Your main goal is security auditing, performance profiling, or code style cleanup (use dedicated workflows/tools).
The repository contains highly sensitive data and you cannot safely store path-rich reports/specs.

Core Principle: Deconstruction Thinking

The framework is a guide, not a boundary. Real projects are messy, creative, and often surprising. Your job is to:

Discover what exists (not assume what should exist)
Understand why it exists (not just what it does)
Document how to replicate it (not just describe it)

The Deconstruction Questions

For EVERY piece of code, structure, or behavior you encounter, ask:

┌─────────────────────────────────────────────────────────────────┐
│                 The 7 Deconstruction Questions                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. WHAT is this?                                               │
│     → Name it. Classify it. Place it in context.                │
│                                                                 │
│  2. WHY does it exist?                                          │
│     → What problem does it solve? What would break without it?  │
│                                                                 │
│  3. HOW does it work?                                           │
│     → Step-by-step logic. Every branch. Every edge case.        │
│                                                                 │
│  4. WHAT are its inputs?                                        │
│     → All forms. All sources. All validations.                  │
│                                                                 │
│  5. WHAT are its outputs?                                       │
│     → All forms. All destinations. All side effects.            │
│                                                                 │
│  6. WHAT depends on it? What does it depend on?                 │
│     → Upstream. Downstream. External.                           │
│                                                                 │
│  7. WHAT can go wrong?                                          │
│     → Errors. Failures. Edge cases. Recovery.                   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

If you can answer all 7 questions for every element, the spec is complete.

Framework Flexibility

The output structure provided below is a starting point. When you encounter something that doesn't fit:

Don't force it into an existing category
Create a new category that accurately describes it
Apply the same 7 questions to document it
Add it to the spec with clear explanation

Examples of things that might not fit standard frameworks:

Custom DSLs or configuration languages
Unusual architectural patterns
Legacy code with implicit rules
Domain-specific algorithms
Complex state machines
Multi-system orchestration
Real-time/streaming logic
Hardware integrations

Document what IS, not what SHOULD BE.

Replicability Test

Every piece of documentation must answer: "Can someone implement this correctly using only this description?"

┌─────────────────────────────────────────────────────────────────┐
│                    Replicability Test                           │
├─────────────────────────────────────────────────────────────────┤
│  For each documented element, verify:                           │
│                                                                 │
│  □ Inputs: All possible input forms documented?                 │
│  □ Outputs: All possible output forms documented?               │
│  □ Logic: Complete decision tree with all branches?             │
│  □ Constraints: All validation rules and limits?                │
│  □ Edge cases: All boundary conditions handled?                 │
│  □ Dependencies: All upstream/downstream interactions?          │
│  □ Errors: All failure modes and handling strategies?           │
│                                                                 │
│  If any answer is "no", the documentation is incomplete.        │
└─────────────────────────────────────────────────────────────────┘

Workflow

Phase 0: Project Discovery

Before applying any framework, understand what the project actually is.

0.1 Initial Scan

# Run the discovery script to get a project overview
bash scripts/discover_project.sh <project_root>

The script identifies:

Project type (backend API, frontend SPA, fullstack, library, CLI, etc.)
Tech stack (languages, frameworks, databases, external services)
Entry points and build system
Directory structure patterns
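As a rough illustration of what this kind of scan involves, the sketch below guesses a tech stack from well-known marker files in the project root. It is not the contents of `scripts/discover_project.sh`; the marker table and the function name `detect_stack` are assumptions made for this example.

```python
from pathlib import Path

# Assumed marker files for illustration; extend for the ecosystems you meet.
MARKERS = {
    "package.json": "Node.js",
    "pyproject.toml": "Python",
    "requirements.txt": "Python",
    "go.mod": "Go",
    "Cargo.toml": "Rust",
    "pom.xml": "Java (Maven)",
    "Dockerfile": "Docker",
}


def detect_stack(project_root: str) -> list[str]:
    """Return the sorted, de-duplicated stacks whose marker files sit in the root."""
    root = Path(project_root)
    found = {label for name, label in MARKERS.items() if (root / name).is_file()}
    return sorted(found)
```

A result such as `["Docker", "Node.js"]` is only a starting hypothesis. Confirm it against entry points and build files before relying on it in the spec.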

0.2 Scope Confirmation

Clarify with the user:

Full project or specific modules?
Include deployment/infrastructure specs?
Target audience for the spec (same team? different team? different tech stack?)
Any known areas of complexity or legacy code?

Phase 1: Structural Decomposition

Decompose the project into documentable units. Do not assume a fixed structure—let the project's actual organization guide you.

1.1 Identify All Documentable Elements

Scan for and categorize everything that needs documentation:

| Category | What to Find | How to Find |
|----------|--------------|-------------|
| Configuration | Environment vars, config files, feature flags | `.env`, `config.`, `settings.` |
| Data Layer | Models, schemas, migrations, seeds | ORM definitions, SQL files, schema files |
| API Layer | Routes, controllers, middleware, auth | Route definitions, handler files |
| Business Logic | Services, use cases, domain rules | Service classes, domain modules |
| Integration | External APIs, queues, caches, storage | Client classes, adapter modules |
| UI Layer | Components, pages, state management | Component files, route configs |
| Infrastructure | Dockerfile, CI/CD, IaC | Deployment configs, pipeline files |
| Testing | Test files, fixtures, mocks | `_test.`, `.spec.`, `__tests__/` |

1.2 Build the Element Inventory

Create a complete inventory before documenting. Use:

bash scripts/inventory_elements.sh <project_root> <output_file>
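A first-pass inventory can be driven by the filename hints from the category table in 1.1. The sketch below is a hypothetical illustration, not the logic of `scripts/inventory_elements.sh`: it matches the table's substrings (plus `Dockerfile` from the Infrastructure row) against file paths, and the names `RULES`, `categorize`, and `build_inventory` are assumptions.

```python
from collections import defaultdict

# (category, substrings) pairs; the first matching rule wins.
# Testing comes first so that a file like "config_test.go" is not
# classified as Configuration.
RULES = [
    ("Testing", ("_test.", ".spec.", "__tests__/")),
    ("Configuration", (".env", "config.", "settings.")),
    ("Infrastructure", ("Dockerfile",)),
]


def categorize(path: str) -> str:
    """Map a file path to a category using plain substring matching."""
    for category, needles in RULES:
        if any(needle in path for needle in needles):
            return category
    # Anything unmatched needs manual review or a new category.
    return "Uncategorized"


def build_inventory(paths):
    """Group paths by category, with each group in sorted order."""
    inventory = defaultdict(list)
    for path in sorted(paths):
        inventory[categorize(path)].append(path)
    return dict(inventory)
```

Substring matching is deliberately crude. The `Uncategorized` bucket is where the non-standard elements described in 1.3 tend to show up.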

1.3 Handle Non-Standard Elements

If you find elements that don't fit the categories above:

Create a new category
Document the category's purpose
Apply the same replicability standards

The framework adapts to the project, not the other way around.

1.4 Document Discoveries and Surprises

Real projects contain surprises. These are often the MOST IMPORTANT things to document because they're the hardest to discover:

| Discovery Type | Example | How to Document |
|----------------|---------|-----------------|
| Implicit Rules | "Orders over $1000 require manager approval" (found in code, not docs) | Create explicit business rule with source code reference |
| Hidden Dependencies | Service A secretly calls Service B for validation | Document in both A and B specs, explain the coupling |
| Legacy Workarounds | "We parse this field differently because of a 2019 bug" | Document the workaround AND the original bug context |
| Undocumented APIs | Internal endpoints used by admin tools | Full API spec, note "undocumented" status |
| Magic Numbers | `if (retries > 3)` - why 3? | Document the number AND reasoning if discoverable |
| Implicit Ordering | "This must run before that" with no explicit dependency | Document the ordering constraint explicitly |
| Environment-Specific Behavior | Different logic in prod vs dev | Document ALL environment variations |
| Technical Debt | "This should be refactored but works" | Document current behavior AND known issues |

Key principle: If you had to figure something out by reading code, that means it wasn't documented. Document it now so the next person doesn't have to.
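Some discovery types can be surfaced mechanically before you read the code in depth. As one example, the sketch below flags numeric literals used in comparisons, such as the `retries > 3` case in the table. It is a heuristic under stated assumptions: the regex, the ignore list of `0` and `1`, and the function name `find_magic_numbers` are all illustrative. It will also report false positives, for example on arrow functions or bit shifts.

```python
import re

# A comparison operator followed by a numeric literal, e.g. "> 3" or "== 2.5".
COMPARISON = re.compile(r"(?:[<>]=?|[=!]=)\s*(\d+(?:\.\d+)?)\b")

# Assumption: 0 and 1 are usually self-explanatory and not worth flagging.
IGNORED = {"0", "1"}


def find_magic_numbers(source: str):
    """Return (line_number, literal, stripped_line) for each suspicious comparison."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in COMPARISON.finditer(line):
            literal = match.group(1)
            if literal not in IGNORED:
                hits.append((line_no, literal, line.strip()))
    return hits
```

Each hit is a prompt for the "why" question. The scan finds the number but not the reasoning, which still has to come from commit history, comments, or the people who wrote the code.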

Phase 2: Deep Extraction

For each element in the inventory, extract complete specifications.

2.1 Extraction Template

Apply this template to every significant element:

## [Element Name]

### Source
- Source: path/to/file.ext
- Source: path/to/file.ext#SymbolName
- Source: path/to/file.ext:123

### Purpose
What problem does this solve? Why does it exist?

### Interface
- Inputs: [All input parameters, types, constraints, defaults]
- Outputs: [All output forms, types, structures]
- Side effects: [State changes, external calls, events emitted]

### Logic
[Complete decision tree or algorithm description]
- Step 1: ...
- Step 2: ...
- Conditions: if X then Y, else Z
- Loops: for each item in collection, do...

### Constraints
- Validation rules: [All input validation]
- Business rules: [Domain constraints]
- Technical limits: [Size limits, rate limits, timeouts]

### Dependencies
- Uses: [What this element calls/imports]
- Used by: [What calls/imports this element]
- External: [Third-party services, APIs]

### Error Handling
- Error conditions: [What can go wrong]
- Error responses: [How errors are communicated]
- Recovery: [Retry logic, fallbacks, cleanup]

### Edge Cases
- Empty/null inputs: [Handling]
- Maximum values: [Handling]
- Concurrent access: [Handling]
- [Any other edge cases specific to this element]
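A small check can catch element specs that skip a section of the template above. The sketch below assumes each element is written with the `###` headings shown in the template. The function name `missing_sections` is hypothetical, and passing the check only shows that the headings exist, not that their content is adequate.

```python
import re

# Section headings taken from the extraction template above.
REQUIRED_SECTIONS = (
    "Source",
    "Purpose",
    "Interface",
    "Logic",
    "Constraints",
    "Dependencies",
    "Error Handling",
    "Edge Cases",
)

# Matches level-3 headings only ("### Title"), not "##" or "####".
HEADING = re.compile(r"^###\s+(.+?)\s*$", re.MULTILINE)


def missing_sections(element_markdown: str) -> list[str]:
    """Return the template sections absent from one element's markdown."""
    present = {m.group(1).strip() for m in HEADING.finditer(element_markdown)}
    return [section for section in REQUIRED_SECTIONS if section not in present]
```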

2.2 Extraction Checklist by Element Type

See references/extraction-checklist.md for detailed checklists for:

API endpoints
Database entities
Business logic modules
UI components
Background jobs
Integration adapters

Phase 3: Cross-Cutting Concerns

Document aspects that span multiple elements:

3.1 Data Flow

How data enters the system
How data transforms through layers
How data exits the system
Data validation checkpoints

3.2 Authentication & Authorization

Auth mechanisms (JWT, sessions, API keys, OAuth)
Permission model (RBAC, ABAC, custom)
Protected resources and access rules

3.3 Error Strategy

Global error handling patterns
Error code taxonomy
Logging and monitoring hooks
User-facing error messages

3.4 State Management

What state exists (DB, cache, session, UI state)
State consistency guarantees
State synchronization mechanisms

3.5 Performance Characteristics

Caching strategies
Query optimization patterns
Async/background processing
Rate limiting

Phase 4: Test Specifications

Extract testable specifications that serve as both documentation and validation criteria.

4.1 Unit Test Specs

For each function/method with business logic:

### Function: calculateDiscount(order, customer)

| Scenario | Input | Expected Output | Notes |
|----------|-------|-----------------|-------|
| New customer, small order | order.total=50, customer.isNew=true | 0 | Min threshold not met |
| New customer, large order | order.total=150, customer.isNew=true | 15 | 10% new customer discount |
| VIP customer | order.total=100, customer.tier='VIP' | 20 | 20% VIP discount |
| Combined discounts | order.total=200, customer.isNew=true, hasPromo=true | 30 | Max discount cap |
| Null order | null, customer | throws InvalidInputError | |
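Because these tables double as validation criteria, it can help to read them back as structured data. The sketch below parses a pipe table like the one above into one dictionary per scenario. The function name `parse_markdown_table` is an assumption, and the parser handles only simple tables with no escaped `|` inside cells.

```python
def parse_markdown_table(text: str) -> list[dict]:
    """Parse a simple pipe table into a list of {header: cell} dictionaries."""
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # not a table row
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # header separator such as |-----|:---:|
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]
```

Each resulting dictionary can feed a parameterized test in whatever framework the target implementation uses. Mapping the `Input` text to real arguments remains a manual step.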

4.2 Integration Test Specs

For each module interaction:

### Integration: OrderService → PaymentGateway

| Scenario | Setup | Action | Verification |
|----------|-------|--------|--------------|
| Successful payment | Valid order, funded card | processPayment() | Order status = PAID, payment record created |
| Declined card | Valid order, declined card | processPayment() | Order status = PAYMENT_FAILED, retry scheduled |
| Gateway timeout | Valid order, slow gateway | processPayment() | Timeout after 30s, order status = PENDING |

4.3 E2E Test Specs

For each critical user journey:

### Journey: User completes purchase

Steps:
1. User adds item to cart → Cart shows 1 item
2. User proceeds to checkout → Checkout page loads with cart summary
3. User enters shipping info → Shipping options displayed
4. User selects payment method → Payment form shown
5. User submits order → Order confirmation displayed, email sent

Variations:
- Guest checkout vs logged-in user
- Single item vs multiple items
- Standard vs express shipping

Phase 5: Verification & Validation

5.1 Completeness Check

Use scripts to assist discovery and gap-finding. These checks are heuristic—they help you find likely missing documentation or broken links, but they do not “prove” completeness.

Run bidirectional verification:

# Code → Spec: Does every code element have spec coverage?
bash scripts/verify_coverage.sh <project_root> <spec_output>

# Spec → Code: Verify spec anchors map to code (requires Source: lines in your spec).
bash scripts/verify_implementation.sh <spec_output> <project_root>
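At its simplest, the bidirectional idea reduces to two set differences over file paths. The sketch below illustrates that idea only and makes no claim about how the two scripts work internally; the function name `coverage_gaps` and the result keys are assumptions.

```python
def coverage_gaps(code_paths, spec_paths):
    """Compare paths found in the code with paths referenced by the spec."""
    code, spec = set(code_paths), set(spec_paths)
    return {
        # Code -> Spec: files that exist but are never referenced by the spec.
        "undocumented": sorted(code - spec),
        # Spec -> Code: spec references that point at files that do not exist.
        "dangling": sorted(spec - code),
    }
```

File-level coverage is a coarse signal. A file can be referenced in the spec and still have undocumented behavior, which is why these checks are described here as heuristic.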

Source anchors (recommended)

To make Spec → Code verification practical, add Source: lines to your spec markdown. Supported formats:

Source: relative/path/to/file.ext
Source: relative/path/to/file.ext#SymbolName (best-effort text match)
Source: relative/path/to/file.ext:123 (line existence check)
Source: relative/path/to/file.ext:123#SymbolName
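To make the four formats concrete, the sketch below parses a `Source:` line and runs the checks described above: file existence, line existence, and a best-effort text match for the symbol. It is an illustrative approximation, not the implementation of `scripts/verify_implementation.sh`. The names `Anchor`, `parse_anchor`, and `check_anchor` are hypothetical, and paths containing spaces, `:`, or `#` are assumed not to occur.

```python
import re
from pathlib import Path
from typing import NamedTuple, Optional


class Anchor(NamedTuple):
    path: str
    line: Optional[int]
    symbol: Optional[str]


# Optional list bullet, "Source:", a path, then optional ":line" and "#Symbol".
ANCHOR_RE = re.compile(
    r"^\s*(?:[-*]\s+)?Source:\s*(?P<path>[^\s:#]+)"
    r"(?::(?P<line>\d+))?(?:#(?P<symbol>[^\s#]+))?"
)


def parse_anchor(text: str) -> Optional[Anchor]:
    """Parse one spec line; return None if it is not a Source anchor."""
    match = ANCHOR_RE.match(text)
    if match is None:
        return None
    line = int(match.group("line")) if match.group("line") else None
    return Anchor(match.group("path"), line, match.group("symbol"))


def check_anchor(project_root: str, anchor: Anchor) -> list[str]:
    """Return a list of problems; an empty list means the anchor resolves."""
    target = Path(project_root) / anchor.path
    if not target.is_file():
        return [f"missing file: {anchor.path}"]
    lines = target.read_text(encoding="utf-8", errors="replace").splitlines()
    problems = []
    if anchor.line is not None and not 1 <= anchor.line <= len(lines):
        problems.append(f"line {anchor.line} out of range in {anchor.path}")
    if anchor.symbol is not None and not any(anchor.symbol in l for l in lines):
        problems.append(f"symbol {anchor.symbol!r} not found in {anchor.path}")
    return problems
```

The symbol check is a plain substring search, so it confirms that the name appears somewhere in the file, not that it is defined there.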

5.2 Replication Test

The ultimate validation—can the spec be used to replicate the project?

## Replication Verification Protocol

1. Provide spec to a fresh Claude instance (no access to original code)
2. Ask it to implement [specific module] using only the spec
3. Compare:
   - Does it handle all documented inputs correctly?
   - Does it produce all documented outputs correctly?
   - Does it handle all documented edge cases?
   - Does it follow all documented constraints?

4. Any discrepancy = spec is incomplete. Update and repeat.

5.3 Diff Report

Generate a report of gaps:

## Spec Completeness Report

### Documented and Verified
- [List of elements with complete specs]

### Documented but Unverified
- [List of elements needing verification]

### Found in Code but Not Documented
- [List of elements missing from spec]

### Ambiguous or Unclear
- [List of elements needing clarification]

Output Structure

Default output structure (adapt based on project type):

spec/
├── 00_Overview/
│   ├── PROJECT.md              # Identity, tech stack, architecture overview
│   ├── ARCHITECTURE.md         # System design, module relationships
│   ├── GLOSSARY.md             # Domain terms and definitions
│   └── diagrams/               # Architecture diagrams (Mermaid/PlantUML)
│
├── 01_Configuration/
│   ├── ENVIRONMENT.md          # All environment variables
│   ├── FEATURE_FLAGS.md        # Feature toggles and their effects
│   └── schemas/                # Config file schemas (JSON Schema, etc.)
│
├── 02_Data/
│   ├── ENTITIES.md             # All data entities with full field specs
│   ├── RELATIONSHIPS.md        # Entity relationships, constraints
│   ├── MIGRATIONS.md           # Schema evolution history
│   └── schemas/                # Executable schema definitions
│
├── 03_API/
│   ├── ENDPOINTS.md            # All API endpoints
│   ├── AUTHENTICATION.md       # Auth mechanisms and flows
│   ├── ERRORS.md               # Error codes and responses
│   └── openapi/                # OpenAPI specs if applicable
│
├── 04_Business_Logic/
│   ├── RULES.md                # Business rules catalog
│   ├── WORKFLOWS.md            # Multi-step processes
│   ├── STATE_MACHINES.md       # State transitions
│   └── CALCULATIONS.md         # Formulas and algorithms
│
├── 05_Integrations/
│   ├── EXTERNAL_APIS.md        # Third-party API interactions
│   ├── MESSAGING.md            # Queue/event interactions
│   └── STORAGE.md              # File/blob storage interactions
│
├── 06_UI/ (if applicable)
│   ├── COMPONENTS.md           # UI component catalog
│   ├── PAGES.md                # Page structures and routing
│   ├── STATE.md                # Frontend state management
│   └── INTERACTIONS.md         # User interaction patterns
│
├── 07_Infrastructure/
│   ├── DEPLOYMENT.md           # Deployment architecture
│   ├── SCALING.md              # Scaling strategies
│   └── MONITORING.md           # Observability setup
│
├── 08_Testing/
│   ├── UNIT_SPECS.md           # Unit test specifications
│   ├── INTEGRATION_SPECS.md    # Integration test specifications
│   ├── E2E_SPECS.md            # End-to-end test specifications
│   └── test-cases/             # Executable test case files
│
├── 09_Verification/
│   ├── COVERAGE_REPORT.md      # Spec coverage analysis
│   ├── REPLICATION_GUIDE.md    # How to replicate from spec
│   └── KNOWN_GAPS.md           # Documented limitations
│
└── SPEC_INDEX.md               # Master index of all spec documents

Important: This structure is a starting point. Add, remove, or reorganize sections based on what the project actually contains.
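Because the layout is meant to be adapted, it can be convenient to hold it as data and generate the directories from that. The sketch below creates a subset of the default tree and never overwrites existing files. It is a hypothetical stand-in for illustration, not the behavior of `scripts/generate_skeleton.sh`; the names `SKELETON` and `create_skeleton` are assumptions.

```python
from pathlib import Path

# A subset of the default tree above; edit to match the actual project.
SKELETON = {
    "00_Overview": ["PROJECT.md", "ARCHITECTURE.md", "GLOSSARY.md"],
    "01_Configuration": ["ENVIRONMENT.md", "FEATURE_FLAGS.md"],
    "02_Data": ["ENTITIES.md", "RELATIONSHIPS.md", "MIGRATIONS.md"],
    "09_Verification": ["COVERAGE_REPORT.md", "REPLICATION_GUIDE.md", "KNOWN_GAPS.md"],
}


def create_skeleton(spec_root: str, layout=SKELETON) -> list[str]:
    """Create missing directories and stub files; return the files created."""
    root = Path(spec_root)
    created = []
    for directory, files in layout.items():
        (root / directory).mkdir(parents=True, exist_ok=True)
        for name in files:
            target = root / directory / name
            if not target.exists():  # never overwrite existing spec content
                target.write_text(f"# {target.stem}\n", encoding="utf-8")
                created.append(target.relative_to(root).as_posix())
    return created
```

Running it a second time returns an empty list, so it is safe to re-run after adding a new category to the layout.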

Resources

Scripts:

scripts/discover_project.sh - Initial project scanning and classification
scripts/inventory_elements.sh - Generate element inventory from codebase
scripts/verify_coverage.sh - Check spec completeness against code
scripts/verify_implementation.sh - Validate spec Source anchors map to code
scripts/generate_skeleton.sh - Create output directory structure

References:

references/extraction-checklist.md - Detailed checklists by element type
references/replicability-criteria.md - Standards for spec quality

AI/ML Systems

If the project includes AI/ML components (LLM integrations, ML models, AI agents, RAG systems), document extra details required for replicability:

Exact prompts and templates (not summaries)
Model/provider identifiers and versions
Context management (inputs, tool calls, memory, truncation rules)
Retrieval (chunking, embedding model, ranking, filters)
Evaluation datasets and pass/fail criteria

Key principle: AI systems require extra documentation because they are non-deterministic. Document exact prompts (not summaries), model versions, and evaluation data.
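One way to keep "exact prompts, not summaries" honest is to store the prompt text verbatim next to a fingerprint, so that later drift in the codebase is detectable. The sketch below is a minimal illustration under assumptions: the `PromptRecord` fields, the 12-character SHA-256 prefix, and the `prompt_drifted` helper are hypothetical and not part of this skill.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRecord:
    name: str         # how the spec refers to this prompt
    model: str        # provider/model identifier exactly as configured
    prompt_text: str  # the verbatim prompt or template, not a summary

    @property
    def fingerprint(self) -> str:
        """Short SHA-256 prefix of the exact prompt text."""
        digest = hashlib.sha256(self.prompt_text.encode("utf-8")).hexdigest()
        return digest[:12]


def prompt_drifted(recorded: PromptRecord, current_text: str) -> bool:
    """True if the prompt in the code no longer matches the documented one."""
    current = hashlib.sha256(current_text.encode("utf-8")).hexdigest()[:12]
    return current != recorded.fingerprint
```

Even a whitespace change alters the fingerprint. That is intentional here, since small prompt edits can change model behavior.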

Critical Reminders

Framework is a guide, not a constraint: If the project has elements outside the framework, extend the framework.
Replicability is the only measure: Every spec element must enable implementation without seeing original code.
Verify bidirectionally: Code → Spec and Spec → Code. Both directions must be complete.
Document the unexpected: Legacy code, workarounds, technical debt, and "why" decisions are often the most valuable documentation.
Test specs are part of the spec: If you can't write a test case for it, you haven't documented it well enough.
AI systems need extra detail: Prompts, model versions, and evaluation data are essential for replication.

Repository

okwinds/miscellany

First Seen

Feb 6, 2026
