Deep Dive Analysis Skill

Overview

This skill combines mechanical structure extraction with Claude's semantic understanding to produce comprehensive codebase documentation. Unlike simple AST parsing, this skill captures:

WHAT the code does (structure, functions, classes)
WHY it exists (business purpose, design decisions)
HOW it integrates (dependencies, contracts, flows)
CONSEQUENCES of changes (side effects, failure modes)

Capabilities

Mechanical Analysis (Scripts):

Extract code structure (classes, functions, imports)
Map dependencies (internal/external)
Find symbol usages across the codebase
Track analysis progress
Classify files by criticality

Semantic Analysis (Claude AI):

Recognize architectural and design patterns
Identify red flags and anti-patterns
Trace data and control flows
Document contracts and invariants
Assess quality and maintainability

Documentation Maintenance:

Review and maintain documentation (Phase 8)
Fix broken links and update navigation indexes
Analyze and rewrite code comments (antirez standards)

Use this skill when:

Analyzing a codebase you're unfamiliar with
Generating documentation that explains WHY, not just WHAT
Identifying architectural patterns and anti-patterns
Performing code review with semantic understanding
Onboarding to a new project

Prerequisites

This skill is invoked by the /deep-dive-analysis command. The command creates and manages state automatically in .deep-dive/ under the target directory:

.deep-dive/state.json -- phase tracking (auto-created by the command)
.deep-dive/<phase-number>-<name>.md -- per-phase output documents

The legacy standalone flow using analysis_progress.json and DEEP_DIVE_PLAN.md at project root is no longer the primary path -- prefer invoking /deep-dive-analysis <target>.
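
As a rough illustration of the layout above, the per-phase documents can be picked out of a `.deep-dive/` listing by their `<phase-number>-<name>.md` naming. This is a minimal sketch: `parse_phase_docs` is a hypothetical helper, not part of the skill's scripts, and it assumes nothing about the contents of `state.json`.

```python
import re

# Filenames of the form <phase-number>-<name>.md, as described above.
PHASE_DOC_RE = re.compile(r"^(\d+)-(.+)\.md$")


def parse_phase_docs(filenames):
    """Return (phase_number, name) pairs sorted by phase number.

    Hypothetical helper for illustration; files that do not follow the
    naming scheme (for example state.json) are skipped.
    """
    docs = []
    for name in filenames:
        match = PHASE_DOC_RE.match(name)
        if match:
            docs.append((int(match.group(1)), match.group(2)))
    return sorted(docs)


# Example listing; in practice the names would come from something like
# (p.name for p in pathlib.Path(target, ".deep-dive").iterdir()).
names = ["state.json", "03-flows.md", "01-structure.md", "07-final-report.md"]
print(parse_phase_docs(names))
```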

CRITICAL PRINCIPLE: ABSOLUTE SOURCE OF TRUTH

THE DOCUMENTATION GENERATED BY THIS SKILL IS THE ABSOLUTE AND UNQUESTIONABLE SOURCE OF TRUTH FOR YOUR PROJECT.

ANY INFORMATION NOT VERIFIED WITH IRREFUTABLE EVIDENCE FROM SOURCE CODE IS FALSE, UNRELIABLE, AND UNACCEPTABLE.

Mandatory Rules (VIOLATION = FAILURE)

NEVER document anything without reading the actual source code first
NEVER assume any existing documentation, comment, or docstring is accurate
NEVER write documentation based on memory, inference, or "what should be"
ALWAYS derive truth EXCLUSIVELY from reading and tracing actual code
ALWAYS provide source file + qualified symbol name for every technical claim
ALWAYS verify state machines, enums, constants against actual definitions
TREAT all pre-existing docs as unverified claims requiring validation
MARK any unverifiable statement as [UNVERIFIED - REQUIRES CODE CHECK]
USE qualified symbol names in markers (file.py::Class.method), never line numbers -- line numbers break on any edit
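
The qualified-name rule above can be illustrated with Python's standard `ast` module. The sketch below is one possible way to derive `file.py::Class.method` markers from source text; `qualified_symbols` and the sample source are hypothetical and are not taken from the skill's own scripts.

```python
import ast


def qualified_symbols(source: str, filename: str) -> list[str]:
    """Return markers like 'file.py::Class.method' for classes and functions."""
    tree = ast.parse(source)
    markers: list[str] = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                # Extend the dotted prefix with this definition's name.
                name = f"{prefix}.{child.name}" if prefix else child.name
                markers.append(f"{filename}::{name}")
                visit(child, name)
            else:
                # Keep descending (e.g. into if/try blocks) with the same prefix.
                visit(child, prefix)

    visit(tree, "")
    return markers


# Hypothetical sample source, used only to demonstrate the marker format.
SAMPLE = '''
class CircuitBreaker:
    def call(self, fn):
        return fn()

def helper():
    pass
'''

for marker in qualified_symbols(SAMPLE, "circuit_breaker.py"):
    print(marker)
```

Because the markers are built from names rather than positions, they stay valid when unrelated lines are added or removed in the file.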

See references/analysis-templates.md for the full verification trust model, temporal purity principle, and documentation status markers.

Output Usage Guide

After analysis completes, consult the right file for your task:

Your Task	Start With	Also Check
Onboarding / understanding the project	07-final-report, 01-structure	04-semantics
Writing new feature	01-structure (Where to Add), 02-interfaces	04-semantics
Fixing a bug	03-flows, 05-risks	01-structure
Refactoring	01-structure, 04-semantics, 05-risks	03-flows
Code review	02-interfaces, 05-risks	06-documentation
Updating documentation	06-documentation, 04-semantics	02-interfaces

Forbidden Files

The analysis NEVER reads or includes contents from sensitive files: .env, .env.*, credentials.*, secrets.*, *.pem, *.key, *.p12, *.pfx, id_rsa*, id_ed25519*, .npmrc, .pypirc, .netrc, or any file containing API keys, passwords, or tokens. If such a file is encountered, note only that it exists and never quote its contents.
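
The name-based part of this rule can be expressed as a filename check. The following is a minimal sketch using the standard `fnmatch` module; `is_forbidden` is a hypothetical helper, and it covers only the listed name patterns, not the content-based rule about files that contain keys, passwords, or tokens.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

# Name patterns copied from the list above.
FORBIDDEN_PATTERNS = (
    ".env", ".env.*", "credentials.*", "secrets.*",
    "*.pem", "*.key", "*.p12", "*.pfx",
    "id_rsa*", "id_ed25519*", ".npmrc", ".pypirc", ".netrc",
)


def is_forbidden(path: str) -> bool:
    """True if the file name (not the directory part) matches a forbidden pattern."""
    name = PurePath(path).name
    return any(fnmatchcase(name, pattern) for pattern in FORBIDDEN_PATTERNS)


for candidate in ("config/.env.production", "certs/server.key", "src/main.py"):
    status = "skip (note existence only)" if is_forbidden(candidate) else "ok to read"
    print(f"{candidate}: {status}")
```

Matching is done on the file name only and is case-sensitive, which is a design choice of this sketch rather than something the rule specifies.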

Available Commands

1. Analyze Single File

python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
  --file src/utils/circuit_breaker.py \
  --output-format markdown

Parameters:

--file / -f: Relative path to file - REQUIRED
--output-format / -o: Output format (json, markdown, summary) - default: summary
--find-usages / -u: Find all usages of exported symbols - default: false
--update-progress / -p: Update analysis_progress.json (the progress file of the legacy standalone flow) - default: false

2. Check Progress

python .claude/skills/deep-dive-analysis/scripts/check_progress.py \
  --phase 1 --status pending

3. Find Usages

python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
  --symbol CircuitBreaker --file src/utils/circuit_breaker.py

4. Generate Phase Report

python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
  --phase 1 --output-format markdown --output-file docs/01_domains/COMMON_LIBRARY.md

Phase 8: Documentation Review Commands

5. Scan Documentation Health

python .claude/skills/deep-dive-analysis/scripts/doc_review.py scan \
  --path docs/ --output doc_health_report.json

6. Validate Links

python .claude/skills/deep-dive-analysis/scripts/doc_review.py validate-links \
  --path docs/ --fix
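
To make the idea of link validation concrete, here is a small sketch of what checking relative Markdown links can look like. It is not the implementation of doc_review.py, whose internals are not shown here; `broken_links` and the demo files are hypothetical, and only inline `[text](target)` links are considered.

```python
import re
import tempfile
from pathlib import Path

# Matches inline Markdown links such as [text](target).
# Reference-style links are ignored in this sketch.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
EXTERNAL_PREFIXES = ("http://", "https://", "mailto:", "#")


def broken_links(doc_path: Path) -> list[str]:
    """Return relative link targets in doc_path that do not exist on disk."""
    text = doc_path.read_text(encoding="utf-8")
    broken = []
    for target in LINK_RE.findall(text):
        if target.startswith(EXTERNAL_PREFIXES):
            continue  # external URLs and same-page anchors are not checked
        file_part = target.split("#", 1)[0]  # drop any #fragment
        if not (doc_path.parent / file_part).exists():
            broken.append(target)
    return broken


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "lifecycle.md").write_text("# Lifecycle\n", encoding="utf-8")
    index = root / "index.md"
    index.write_text(
        "[ok](lifecycle.md#states) [gone](missing.md) [web](https://example.com)\n",
        encoding="utf-8",
    )
    demo_result = broken_links(index)

print(demo_result)
```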

7. Verify Against Source Code

python .claude/skills/deep-dive-analysis/scripts/doc_review.py verify \
  --doc docs/agents/lifecycle.md --source src/agents/lifecycle.py

8. Update Navigation Indexes

python .claude/skills/deep-dive-analysis/scripts/doc_review.py update-indexes \
  --search-index docs/00_navigation/SEARCH_INDEX.md \
  --by-domain docs/00_navigation/BY_DOMAIN.md

9. Full Documentation Maintenance

python .claude/skills/deep-dive-analysis/scripts/doc_review.py full-maintenance \
  --path docs/ --auto-fix --output doc_health_report.json

Executes: scan health, validate/fix links, identify obsolete files, update indexes, generate report.

Comment Quality Commands (Antirez Standards)

10. Analyze Comment Quality

python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py analyze \
  src/main.py --report

11. Scan Directory for Comment Issues

python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py scan \
  src/ --recursive --issues-only

12. Generate Comment Health Report

python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py report \
  src/ --output comment_health.md

13. Rewrite Comments

python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py rewrite \
  src/main.py --apply --backup

14. View Standards Reference

python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py standards

File Classification Criteria

Classification	Criteria	Verification
Critical	Handles authentication, security, encryption, sensitive data	Mandatory
High-Complexity	>300 LOC, >5 dependencies, state machines, async patterns	Mandatory
Standard	Normal business logic, data models, utilities	Recommended
Utility	Pure functions, helpers, constants	Optional
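
The table can be read as a decision procedure. The sketch below is one hedged interpretation: it assumes that any single High-Complexity criterion is enough to trigger that class, and it detects Critical files with a path-keyword heuristic. `FileFacts`, `classify`, and `CRITICAL_KEYWORDS` are hypothetical names, not the skill's actual classifier.

```python
from dataclasses import dataclass

# Assumption: a crude path-based proxy for "handles authentication, security, ...".
CRITICAL_KEYWORDS = ("auth", "security", "encrypt", "password", "token")

# Verification level per classification, copied from the table above.
VERIFICATION = {
    "Critical": "Mandatory",
    "High-Complexity": "Mandatory",
    "Standard": "Recommended",
    "Utility": "Optional",
}


@dataclass
class FileFacts:
    path: str
    loc: int
    dependency_count: int
    has_state_machine: bool = False
    uses_async: bool = False
    is_pure_helper: bool = False


def classify(facts: FileFacts) -> str:
    """Apply the table top to bottom; the first matching row wins."""
    lowered = facts.path.lower()
    if any(keyword in lowered for keyword in CRITICAL_KEYWORDS):
        return "Critical"
    # Assumption: the criteria are alternatives, so one is sufficient.
    if (facts.loc > 300 or facts.dependency_count > 5
            or facts.has_state_machine or facts.uses_async):
        return "High-Complexity"
    if facts.is_pure_helper:
        return "Utility"
    return "Standard"


facts = FileFacts("src/orders/service.py", loc=450, dependency_count=3)
label = classify(facts)
print(label, "->", VERIFICATION[label])
```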

AI-Powered Semantic Analysis

Five Layers of Understanding

Layer	What	Who Does It
1. WHAT	Classes, functions, imports	Scripts (AST)
2. HOW	Algorithm details, data flow	Claude's first pass
3. WHY	Business purpose, design decisions	Claude's deep analysis
4. WHEN	Triggers, lifecycle, concurrency	Claude's behavioral analysis
5. CONSEQUENCES	Side effects, failure modes	Claude's systems thinking

Pattern Recognition

Pattern Type	Examples	Documentation Focus
Architectural	Repository, Service, CQRS, Event-Driven	Responsibilities, boundaries
Behavioral	State Machine, Strategy, Observer, Chain of Responsibility	Transitions, variations
Resilience	Circuit Breaker, Retry, Bulkhead, Timeout	Thresholds, fallbacks
Data	DTO, Value Object, Aggregate	Invariants, relationships
Concurrency	Producer-Consumer, Worker Pool	Thread safety, backpressure

Red Flags to Identify

ARCHITECTURE:
- GOD CLASS: >10 public methods or >500 LOC
- CIRCULAR DEPENDENCY: A -> B -> C -> A
- LEAKY ABSTRACTION: Implementation details in interface

RELIABILITY:
- SWALLOWED EXCEPTION: Empty catch blocks
- MISSING TIMEOUT: Network calls without timeout
- RACE CONDITION: Shared mutable state without sync

SECURITY:
- HARDCODED SECRET: Passwords, API keys in code
- SQL INJECTION: String concatenation in queries
- MISSING VALIDATION: Unsanitized user input
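
Two of these red flags lend themselves to mechanical detection. The sketch below uses the standard `ast` module to flag except handlers that do nothing and classes with more than 10 public methods, reporting each with a qualified symbol name rather than a line number. `RedFlagVisitor` and `find_red_flags` are hypothetical names; the >500 LOC threshold and the other flags are not covered and need semantic review.

```python
import ast

MAX_PUBLIC_METHODS = 10  # threshold taken from the GOD CLASS entry above


def _is_noop(stmt: ast.stmt) -> bool:
    """True for `pass` or a bare `...` statement."""
    if isinstance(stmt, ast.Pass):
        return True
    return (
        isinstance(stmt, ast.Expr)
        and isinstance(stmt.value, ast.Constant)
        and stmt.value.value is Ellipsis
    )


class RedFlagVisitor(ast.NodeVisitor):
    def __init__(self, filename: str) -> None:
        self.filename = filename
        self.scope: list[str] = []  # enclosing class/function names
        self.flags: list[str] = []

    def _marker(self) -> str:
        return f"{self.filename}::{'.'.join(self.scope) or '<module>'}"

    def _visit_scope(self, node) -> None:
        self.scope.append(node.name)
        self.generic_visit(node)
        self.scope.pop()

    visit_FunctionDef = _visit_scope
    visit_AsyncFunctionDef = _visit_scope

    def visit_ClassDef(self, node: ast.ClassDef) -> None:
        public = [
            item for item in node.body
            if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            and not item.name.startswith("_")
        ]
        self.scope.append(node.name)
        if len(public) > MAX_PUBLIC_METHODS:
            self.flags.append(f"GOD CLASS: {self._marker()} ({len(public)} public methods)")
        self.generic_visit(node)
        self.scope.pop()

    def visit_ExceptHandler(self, node: ast.ExceptHandler) -> None:
        if all(_is_noop(stmt) for stmt in node.body):
            self.flags.append(f"SWALLOWED EXCEPTION: {self._marker()}")
        self.generic_visit(node)


def find_red_flags(source: str, filename: str) -> list[str]:
    visitor = RedFlagVisitor(filename)
    visitor.visit(ast.parse(source))
    return visitor.flags


# Hypothetical sample with an except handler that silently discards errors.
SAMPLE = '''
class Client:
    def fetch(self):
        try:
            return self._get()
        except Exception:
            pass
'''

print(find_red_flags(SAMPLE, "client.py"))
```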

AI Analysis Workflow

1. SCRIPTS RUN FIRST -> classifier.py, ast_parser.py, usage_finder.py
2. CLAUDE ANALYZES -> Read source, apply semantic questions, recognize patterns, identify red flags
3. CLAUDE DOCUMENTS -> Use template, explain WHY not just WHAT, document contracts
4. VERIFY -> Check against runtime behavior, validate with code traces

Analysis Loop Workflow

1. CLASSIFY -> LOC, dependencies, critical patterns, assign classification
2. READ & MAP -> AST structure, classes, functions, constants, state mutations
3. DEPENDENCY CHECK -> Internal imports, external imports, external calls
4. CONTEXT ANALYSIS -> Symbol usages, importing modules, message flows
5. RUNTIME VERIFICATION (Critical/High-Complexity) -> Log analysis, flow verification
6. DOCUMENTATION -> Update progress, generate report, cross-reference
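
Step 3 (DEPENDENCY CHECK) can be sketched as follows. This is an illustrative approach, not the skill's own dependency mapper: `split_imports` is a hypothetical helper, the set of internal top-level packages is supplied by the caller, and everything else (including the standard library) is counted as external.

```python
import ast


def split_imports(source: str, internal_packages: set[str]) -> dict[str, list[str]]:
    """Group imported module names into 'internal' and 'external'."""
    result: dict[str, list[str]] = {"internal": [], "external": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports keep their leading dots, e.g. "." or "..utils".
            modules = ["." * node.level + (node.module or "")]
        else:
            continue
        for module in modules:
            top_level = module.split(".")[0]
            # Relative imports always point inside the project.
            is_internal = module.startswith(".") or top_level in internal_packages
            bucket = "internal" if is_internal else "external"
            if module not in result[bucket]:
                result[bucket].append(module)
    return result


# Hypothetical module source; "src" is assumed to be the project's package root.
SAMPLE = '''
import os
import requests
from src.utils import circuit_breaker
from . import models
'''

deps = split_imports(SAMPLE, internal_packages={"src"})
print(deps)
```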

Best Practices

Source Code Analysis (Phases 1-7)

Start with Phase 1 - foundation modules inform everything else
Track progress with --update-progress
Never skip runtime verification for critical/high-complexity files
Cross-reference with CONTEXT.md after analysis

Documentation Maintenance (Phase 8)

Run scan first to understand current state
Fix links before content - broken links indicate structural issues
Verify against code before updating documentation
Update indexes last to reflect final state

References

references/analysis-templates.md - Verification trust model, temporal purity principle, documentation status markers, comment classification, maintenance workflows
references/AI_ANALYSIS_METHODOLOGY.md - Complete analysis methodology
references/SEMANTIC_PATTERNS.md - Pattern recognition guide
references/ANTIREZ_COMMENTING_STANDARDS.md - Comment taxonomy
references/DEEP_DIVE_PLAN.md - Master analysis plan with all phase definitions
templates/semantic_analysis.md - AI-powered per-file analysis template
templates/analysis_report.md - Module-level report template

Resources

Scripts: scripts/ - Python analysis tools
- analyze_file.py - Source code analysis (Phases 1-7)
- check_progress.py - Progress tracking
- doc_review.py - Documentation maintenance (Phase 8)
- comment_rewriter.py - Comment analysis engine
- rewrite_comments.py - Comment quality CLI tool
