deep-dive-analysis
Deep Dive Analysis Skill
Overview
This skill combines mechanical structure extraction with Claude's semantic understanding to produce comprehensive codebase documentation. Unlike simple AST parsing, this skill captures:
- WHAT the code does (structure, functions, classes)
- WHY it exists (business purpose, design decisions)
- HOW it integrates (dependencies, contracts, flows)
- CONSEQUENCES of changes (side effects, failure modes)
Capabilities
Mechanical Analysis (Scripts):
- Extract code structure (classes, functions, imports)
- Map dependencies (internal/external)
- Find symbol usages across the codebase
- Track analysis progress
- Classify files by criticality
Semantic Analysis (Claude AI):
- Recognize architectural and design patterns
- Identify red flags and anti-patterns
- Trace data and control flows
- Document contracts and invariants
- Assess quality and maintainability
Documentation Maintenance:
- Review and maintain documentation (Phase 8)
- Fix broken links and update navigation indexes
- Analyze and rewrite code comments (antirez standards)
Use this skill when:
- Analyzing a codebase you're unfamiliar with
- Generating documentation that explains WHY, not just WHAT
- Identifying architectural patterns and anti-patterns
- Performing code review with semantic understanding
- Onboarding to a new project
Prerequisites
- analysis_progress.json must exist in project root (created by DEEP_DIVE_PLAN setup)
- DEEP_DIVE_PLAN.md should be reviewed to understand phase structure
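The first prerequisite can be checked before running any script. The sketch below is an illustration only: `check_prerequisites` is a hypothetical helper, not part of the skill's scripts, and it assumes nothing about the schema of analysis_progress.json beyond it being valid JSON.

```python
import json
from pathlib import Path


def check_prerequisites(project_root: str) -> list[str]:
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    progress = Path(project_root) / "analysis_progress.json"
    if not progress.is_file():
        problems.append("analysis_progress.json is missing from the project root")
    else:
        try:
            # Only well-formedness is checked; the real schema is not assumed here.
            json.loads(progress.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"analysis_progress.json is not valid JSON: {exc}")
    return problems
```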
CRITICAL PRINCIPLE: ABSOLUTE SOURCE OF TRUTH
THE DOCUMENTATION GENERATED BY THIS SKILL IS THE ABSOLUTE AND UNQUESTIONABLE SOURCE OF TRUTH FOR YOUR PROJECT.
ANY INFORMATION NOT VERIFIED WITH IRREFUTABLE EVIDENCE FROM SOURCE CODE IS FALSE, UNRELIABLE, AND UNACCEPTABLE.
Mandatory Rules (VIOLATION = FAILURE)
- NEVER document anything without reading the actual source code first
- NEVER assume any existing documentation, comment, or docstring is accurate
- NEVER write documentation based on memory, inference, or "what should be"
- ALWAYS derive truth EXCLUSIVELY from reading and tracing actual code
- ALWAYS provide source file + line number for every technical claim
- ALWAYS verify state machines, enums, constants against actual definitions
- TREAT all pre-existing docs as unverified claims requiring validation
- MARK any unverifiable statement as [UNVERIFIED - REQUIRES CODE CHECK]
See references/analysis-templates.md for the full verification trust model, temporal purity principle, and documentation status markers.
Output Usage Guide
After analysis completes, consult the right file for your task:
| Your Task | Start With | Also Check |
|---|---|---|
| Onboarding / understanding the project | 07-final-report, 01-structure | 04-semantics |
| Writing a new feature | 01-structure (Where to Add), 02-interfaces | 04-semantics |
| Fixing a bug | 03-flows, 05-risks | 01-structure |
| Refactoring | 01-structure, 04-semantics, 05-risks | 03-flows |
| Code review | 02-interfaces, 05-risks | 06-documentation |
| Updating documentation | 06-documentation, 04-semantics | 02-interfaces |
Forbidden Files
The analysis NEVER reads or includes contents from sensitive files: .env, .env.*, credentials.*, secrets.*, *.pem, *.key, *.p12, *.pfx, id_rsa*, id_ed25519*, .npmrc, .pypirc, .netrc, or any file containing API keys, passwords, or tokens. If encountered, note file existence only - never quote contents.
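A name-based filter for the patterns listed above might look like the following. This is a minimal sketch, not the skill's actual implementation: `FORBIDDEN_PATTERNS` and `is_forbidden` are hypothetical names, and matching is done case-insensitively on the file name only, which is an assumption.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

# Patterns copied from the forbidden-file list above.
FORBIDDEN_PATTERNS = (
    ".env", ".env.*", "credentials.*", "secrets.*",
    "*.pem", "*.key", "*.p12", "*.pfx",
    "id_rsa*", "id_ed25519*", ".npmrc", ".pypirc", ".netrc",
)


def is_forbidden(path: str) -> bool:
    """Return True if the file name matches any forbidden pattern."""
    name = PurePath(path).name.lower()
    return any(fnmatchcase(name, pattern) for pattern in FORBIDDEN_PATTERNS)
```

A filter like this covers only the named patterns. Files that contain API keys, passwords, or tokens under other names still have to be recognized by inspection.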
Available Commands
1. Analyze Single File
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
--file src/utils/circuit_breaker.py \
--output-format markdown
Parameters:
- --file/-f: Relative path to file - REQUIRED
- --output-format/-o: Output format (json, markdown, summary) - default: summary
- --find-usages/-u: Find all usages of exported symbols - default: false
- --update-progress/-p: Update analysis_progress.json - default: false
2. Check Progress
python .claude/skills/deep-dive-analysis/scripts/check_progress.py \
--phase 1 --status pending
3. Find Usages
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
--symbol CircuitBreaker --file src/utils/circuit_breaker.py
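Conceptually, finding usages means locating every reference to a symbol across the parsed sources. The sketch below illustrates the idea with Python's standard `ast` module; `find_usages` is a hypothetical function and is not claimed to match how the skill's script works. It matches bare names and attribute accesses and does not count import statements.

```python
import ast


def find_usages(symbol: str, sources: dict[str, str]) -> list[tuple[str, int]]:
    """Return sorted (path, line) pairs where `symbol` is referenced.

    `sources` maps a file path to that file's source text.
    """
    hits = []
    for path, source in sources.items():
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Name) and node.id == symbol:
                hits.append((path, node.lineno))      # e.g. CircuitBreaker()
            elif isinstance(node, ast.Attribute) and node.attr == symbol:
                hits.append((path, node.lineno))      # e.g. cb.CircuitBreaker()
    return sorted(hits)
```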
4. Generate Phase Report
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
--phase 1 --output-format markdown --output-file docs/01_domains/COMMON_LIBRARY.md
Phase 8: Documentation Review Commands
5. Scan Documentation Health
python .claude/skills/deep-dive-analysis/scripts/doc_review.py scan \
--path docs/ --output doc_health_report.json
6. Validate Links
python .claude/skills/deep-dive-analysis/scripts/doc_review.py validate-links \
--path docs/ --fix
7. Verify Against Source Code
python .claude/skills/deep-dive-analysis/scripts/doc_review.py verify \
--doc docs/agents/lifecycle.md --source src/agents/lifecycle.py
8. Update Navigation Indexes
python .claude/skills/deep-dive-analysis/scripts/doc_review.py update-indexes \
--search-index docs/00_navigation/SEARCH_INDEX.md \
--by-domain docs/00_navigation/BY_DOMAIN.md
9. Full Documentation Maintenance
python .claude/skills/deep-dive-analysis/scripts/doc_review.py full-maintenance \
--path docs/ --auto-fix --output doc_health_report.json
Executes: scan health, validate/fix links, identify obsolete files, update indexes, generate report.
Comment Quality Commands (Antirez Standards)
10. Analyze Comment Quality
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py analyze \
src/main.py --report
11. Scan Directory for Comment Issues
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py scan \
src/ --recursive --issues-only
12. Generate Comment Health Report
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py report \
src/ --output comment_health.md
13. Rewrite Comments
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py rewrite \
src/main.py --apply --backup
14. View Standards Reference
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py standards
File Classification Criteria
| Classification | Criteria | Verification |
|---|---|---|
| Critical | Handles authentication, security, encryption, sensitive data | Mandatory |
| High-Complexity | >300 LOC, >5 dependencies, state machines, async patterns | Mandatory |
| Standard | Normal business logic, data models, utilities | Recommended |
| Utility | Pure functions, helpers, constants | Optional |
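The table can be read as a decision procedure. The sketch below rests on two assumptions the table does not state: the High-Complexity criteria are treated as alternatives (any one is enough), and rows are evaluated top to bottom. `FileFacts` and `classify` are hypothetical names, and the boolean fields stand in for judgments that would come from reading the code.

```python
from dataclasses import dataclass


@dataclass
class FileFacts:
    loc: int
    dependency_count: int
    handles_sensitive_data: bool = False  # authentication, security, encryption
    has_state_machine: bool = False
    uses_async: bool = False
    is_pure_helper: bool = False          # pure functions, helpers, constants


def classify(facts: FileFacts) -> tuple[str, str]:
    """Return (classification, verification level) following the table rows in order."""
    if facts.handles_sensitive_data:
        return ("Critical", "Mandatory")
    if (facts.loc > 300 or facts.dependency_count > 5
            or facts.has_state_machine or facts.uses_async):
        return ("High-Complexity", "Mandatory")
    if facts.is_pure_helper:
        return ("Utility", "Optional")
    return ("Standard", "Recommended")
```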
AI-Powered Semantic Analysis
Five Layers of Understanding
| Layer | What | Who Does It |
|---|---|---|
| 1. WHAT | Classes, functions, imports | Scripts (AST) |
| 2. HOW | Algorithm details, data flow | Claude's first pass |
| 3. WHY | Business purpose, design decisions | Claude's deep analysis |
| 4. WHEN | Triggers, lifecycle, concurrency | Claude's behavioral analysis |
| 5. CONSEQUENCES | Side effects, failure modes | Claude's systems thinking |
Pattern Recognition
| Pattern Type | Examples | Documentation Focus |
|---|---|---|
| Architectural | Repository, Service, CQRS, Event-Driven | Responsibilities, boundaries |
| Behavioral | State Machine, Strategy, Observer, Chain | Transitions, variations |
| Resilience | Circuit Breaker, Retry, Bulkhead, Timeout | Thresholds, fallbacks |
| Data | DTO, Value Object, Aggregate | Invariants, relationships |
| Concurrency | Producer-Consumer, Worker Pool | Thread safety, backpressure |
Red Flags to Identify
ARCHITECTURE:
- GOD CLASS: >10 public methods or >500 LOC
- CIRCULAR DEPENDENCY: A -> B -> C -> A
- LEAKY ABSTRACTION: Implementation details in interface
RELIABILITY:
- SWALLOWED EXCEPTION: Empty catch blocks
- MISSING TIMEOUT: Network calls without timeout
- RACE CONDITION: Shared mutable state without sync
SECURITY:
- HARDCODED SECRET: Passwords, API keys in code
- SQL INJECTION: String concatenation in queries
- MISSING VALIDATION: Unsanitized user input
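Two of the red flags above have a mechanical approximation for Python sources. The following is a rough sketch using the standard `ast` module, not the skill's own detector; `find_red_flags` is a hypothetical name. It treats an `except` body consisting only of `pass` or `...` as a swallowed exception, and applies the GOD CLASS thresholds from the list (more than 10 public methods, or more than 500 lines).

```python
import ast


def _is_noop(stmt: ast.stmt) -> bool:
    """True for `pass` or a bare `...` statement."""
    if isinstance(stmt, ast.Pass):
        return True
    return (isinstance(stmt, ast.Expr)
            and isinstance(stmt.value, ast.Constant)
            and stmt.value.value is Ellipsis)


def find_red_flags(source: str, max_public_methods: int = 10,
                   max_class_loc: int = 500) -> list[tuple[int, str]]:
    """Return sorted (line, flag) pairs for the two checks sketched here."""
    flags = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            if all(_is_noop(stmt) for stmt in node.body):
                flags.append((node.lineno, "SWALLOWED EXCEPTION"))
        elif isinstance(node, ast.ClassDef):
            public = [n for n in node.body
                      if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
                      and not n.name.startswith("_")]
            class_loc = node.end_lineno - node.lineno + 1
            if len(public) > max_public_methods or class_loc > max_class_loc:
                flags.append((node.lineno, "GOD CLASS"))
    return sorted(flags)


sample = "try:\n    risky()\nexcept Exception:\n    pass\n"
print(find_red_flags(sample))  # [(3, 'SWALLOWED EXCEPTION')]
```

Flags such as LEAKY ABSTRACTION or RACE CONDITION depend on intent and context, so they remain part of the semantic pass rather than a syntactic check.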
AI Analysis Workflow
1. SCRIPTS RUN FIRST -> classifier.py, ast_parser.py, usage_finder.py
2. CLAUDE ANALYZES -> Read source, apply semantic questions, recognize patterns, identify red flags
3. CLAUDE DOCUMENTS -> Use template, explain WHY not just WHAT, document contracts
4. VERIFY -> Check against runtime behavior, validate with code traces
Analysis Loop Workflow
1. CLASSIFY -> LOC, dependencies, critical patterns, assign classification
2. READ & MAP -> AST structure, classes, functions, constants, state mutations
3. DEPENDENCY CHECK -> Internal imports, external imports, external calls
4. CONTEXT ANALYSIS -> Symbol usages, importing modules, message flows
5. RUNTIME VERIFICATION (Critical/High-Complexity) -> Log analysis, flow verification
6. DOCUMENTATION -> Update progress, generate report, cross-reference
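Steps 2 and 3 of the loop are the mechanical part. For a Python source file they could be approximated as below, assuming the standard `ast` module; `map_structure` is a hypothetical name, and the skill's own scripts may extract more (constants, state mutations, external calls). Only top-level definitions are collected.

```python
import ast


def map_structure(source: str) -> dict[str, list[str]]:
    """Collect top-level classes, functions, and imported module names."""
    structure = {"classes": [], "functions": [], "imports": []}
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            structure["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            structure["functions"].append(node.name)
        elif isinstance(node, ast.Import):
            structure["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Leading dots mark relative (internal) imports, e.g. ".utils".
            structure["imports"].append("." * node.level + (node.module or ""))
    return structure
```

Relative imports keep their leading dots in the result, which gives a simple first cut at separating internal from external dependencies.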
Best Practices
Source Code Analysis (Phases 1-7)
- Start with Phase 1 - foundation modules inform everything else
- Track progress with --update-progress
- Never skip runtime verification for critical/high-complexity files
- Cross-reference with CONTEXT.md after analysis
Documentation Maintenance (Phase 8)
- Run scan first to understand current state
- Fix links before content - broken links indicate structural issues
- Verify against code before updating documentation
- Update indexes last to reflect final state
References
- references/analysis-templates.md - Verification trust model, temporal purity principle, documentation status markers, comment classification, maintenance workflows
- references/AI_ANALYSIS_METHODOLOGY.md - Complete analysis methodology
- references/SEMANTIC_PATTERNS.md - Pattern recognition guide
- references/ANTIREZ_COMMENTING_STANDARDS.md - Comment taxonomy
- references/DEEP_DIVE_PLAN.md - Master analysis plan with all phase definitions
- templates/semantic_analysis.md - AI-powered per-file analysis template
- templates/analysis_report.md - Module-level report template
Resources
- Scripts: scripts/ - Python analysis tools
  - analyze_file.py - Source code analysis (Phases 1-7)
  - check_progress.py - Progress tracking
  - doc_review.py - Documentation maintenance (Phase 8)
  - comment_rewriter.py - Comment analysis engine
  - rewrite_comments.py - Comment quality CLI tool