code-summarizer
Code Summarizer
Generate clear, concise summaries of source code at any scale - from individual functions to entire codebases.
Overview
This skill helps analyze and summarize code by adapting the level of detail to the code's scale:
- Small-scale code (functions, classes, small files): Provide focused summaries of intention and implementation
- Large-scale code (modules, packages, entire repositories): Provide hierarchical summaries with progressive drill-down
Workflow Decision Tree
User provides code → Assess scale
├─ Small-scale (< 200 lines, single file/function)
│  └─ Generate focused summary
│
└─ Large-scale (> 200 lines or multiple files/modules)
   ├─ Generate high-level overview
   ├─ List main modules/components
   └─ Prompt user to select specific parts for detailed analysis
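The scale check in the decision tree can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: `SourceFile`, `assess_scale`, and `SMALL_SCALE_MAX_LINES` are hypothetical names, and the 200-line threshold is treated as a rough guide.

```python
from dataclasses import dataclass

# Threshold taken from the decision tree; it is a rule of thumb, not a hard limit.
SMALL_SCALE_MAX_LINES = 200


@dataclass
class SourceFile:
    """Hypothetical container for one file supplied by the user."""
    path: str
    text: str


def assess_scale(files: list[SourceFile]) -> str:
    """Return "small" or "large" for the provided code."""
    total_lines = sum(len(f.text.splitlines()) for f in files)
    # Small-scale means a single file that stays under the line threshold.
    if len(files) == 1 and total_lines < SMALL_SCALE_MAX_LINES:
        return "small"
    # Anything else (multiple files, or a long single file) gets the
    # hierarchical treatment.
    return "large"
```

In this sketch a file of exactly 200 lines is treated as large-scale; that boundary choice is an assumption.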
Small-Scale Code Summarization
For functions, classes, or small files (typically < 200 lines), provide a focused summary that includes:
Summary Structure
- Purpose Statement (1-2 sentences)
  - What does this code do?
  - What problem does it solve?
- Core Logic (2-4 bullet points)
  - Key algorithms or approaches used
  - Important data transformations
  - Critical control flow decisions
- Key Details
  - Input parameters and their purposes
  - Return values and their meaning
  - Important side effects or state changes
  - Dependencies on external libraries or modules
- Notable Patterns (if applicable)
  - Design patterns used
  - Optimization techniques
  - Error handling approaches
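One way to keep summaries consistent is to hold the four parts in a small data structure and render them in a fixed order. The sketch below assumes a hypothetical `SmallScaleSummary` class; the field names mirror the structure above and are not part of any existing API.

```python
from dataclasses import dataclass, field


@dataclass
class SmallScaleSummary:
    """Hypothetical holder for the four parts of a small-scale summary."""
    purpose: str
    core_logic: list[str]
    key_details: list[str]
    notable_patterns: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        lines = ["## Summary", f"**Purpose**: {self.purpose}", "**Core Logic**:"]
        lines += [f"- {item}" for item in self.core_logic]
        lines.append("**Key Details**:")
        lines += [f"- {item}" for item in self.key_details]
        # Notable Patterns is optional, so it is emitted only when present.
        if self.notable_patterns:
            lines.append("**Notable Patterns**:")
            lines += [f"- {item}" for item in self.notable_patterns]
        return "\n".join(lines)
```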
Example Format
## Summary
**Purpose**: This function validates user email addresses and normalizes them to lowercase format before database storage.
**Core Logic**:
- Uses regex pattern matching to validate email format (RFC 5322 compliant)
- Strips whitespace and converts to lowercase for consistency
- Checks against a blocklist of disposable email domains
- Logs validation failures for security monitoring
**Key Details**:
- Input: `email` (string) - raw email address from user input
- Returns: `normalized_email` (string) or raises `ValidationError`
- Side effect: Logs to `security.log` on validation failure
- Dependencies: `re`, `logging`, custom `EmailBlocklist` class
**Notable Patterns**:
- Uses early return pattern for validation failures
- Implements defensive programming with input sanitization
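For orientation, here is a hypothetical function that a summary like the one above might describe. Everything in it is assumed for illustration: the regex is a simplified pattern rather than a full RFC 5322 validator, `DISPOSABLE_DOMAINS` stands in for the custom `EmailBlocklist` class, and routing the logger to `security.log` is left out.

```python
import logging
import re

# Handler setup (e.g. writing to security.log) is omitted in this sketch.
logger = logging.getLogger("security")

# Simplified pattern for illustration; NOT a complete RFC 5322 validator.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

# Hypothetical stand-in for the custom EmailBlocklist class.
DISPOSABLE_DOMAINS = {"disposable.example"}


class ValidationError(ValueError):
    """Raised when an email address fails validation."""


def validate_email(email: str) -> str:
    # Normalize first so later checks see a consistent form.
    normalized = email.strip().lower()
    # Fail early on a malformed address.
    if not EMAIL_RE.match(normalized):
        logger.warning("email validation failed: invalid format")
        raise ValidationError("invalid email format")
    domain = normalized.rsplit("@", 1)[1]
    # Fail early on a blocklisted domain.
    if domain in DISPOSABLE_DOMAINS:
        logger.warning("email validation failed: disposable domain %s", domain)
        raise ValidationError("disposable email domain")
    return normalized
```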
Large-Scale Code Summarization
For modules, packages, or entire repositories (typically > 200 lines or multiple files), use a hierarchical approach:
Phase 1: High-Level Overview
Provide a concise overview that includes:
- Project Purpose (2-3 sentences)
  - What does this codebase do?
  - What is its primary use case or domain?
- Architecture Overview
  - Overall design pattern (MVC, microservices, layered, etc.)
  - Key architectural decisions
  - Technology stack
- Main Components (list with brief descriptions)
  - List 5-10 major modules/packages
  - One-line description for each
  - Indicate relationships between components
- Entry Points
  - Main execution files
  - Key API endpoints or interfaces
  - Configuration files
Phase 2: Interactive Drill-Down
After providing the overview, prompt the user to select specific areas for detailed analysis:
## Detailed Analysis Available
I can provide more detailed summaries of specific components:
1. **[Component Name]** - [Brief description]
2. **[Component Name]** - [Brief description]
3. **[Component Name]** - [Brief description]
...
Which component(s) would you like me to analyze in detail? You can:
- Select one or more by number
- Ask about specific functionality (e.g., "How does authentication work?")
- Request a specific file or module by name
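The prompt template above can be filled in from the component list produced in Phase 1. The following is a minimal sketch; `build_drilldown_prompt` is a hypothetical helper, and components are assumed to arrive as (name, description) pairs.

```python
def build_drilldown_prompt(components: list[tuple[str, str]]) -> str:
    """Render the drill-down prompt for a list of (name, description) pairs."""
    lines = [
        "## Detailed Analysis Available",
        "I can provide more detailed summaries of specific components:",
    ]
    # Number the components so the user can select them by number.
    for number, (name, description) in enumerate(components, start=1):
        lines.append(f"{number}. **{name}** - {description}")
    lines += [
        "Which component(s) would you like me to analyze in detail? You can:",
        "- Select one or more by number",
        '- Ask about specific functionality (e.g., "How does authentication work?")',
        "- Request a specific file or module by name",
    ]
    return "\n".join(lines)
```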
Phase 3: Detailed Component Analysis
When the user selects a component, provide a detailed summary using the small-scale format, adapted for the component:
- Purpose and responsibilities
- Key classes/functions within the component
- Interactions with other components
- Important algorithms or business logic
- Configuration and dependencies
Best Practices
Code Analysis Approach
- Read strategically
  - Start with entry points (main files, `__init__.py`, index files)
  - Examine directory structure for organization patterns
  - Look for README, documentation, or comments
  - Identify configuration files
- Identify patterns
  - Recognize common design patterns
  - Note architectural styles
  - Identify framework conventions
- Focus on intent over implementation
  - Explain what and why before how
  - Highlight business logic over boilerplate
  - Emphasize key algorithms over routine operations
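The "read strategically" step can be approximated by a first pass that sorts files into entry points, documentation, and configuration before anything is read in depth. This is a rough sketch: `triage_files` is a hypothetical helper, and the filename sets are assumed, non-exhaustive examples that would need adjusting per project.

```python
from pathlib import Path

# Assumed, non-exhaustive filename sets; adjust them for the project at hand.
ENTRY_POINT_NAMES = {"main.py", "__main__.py", "__init__.py", "index.js", "index.ts", "main.go"}
DOC_NAMES = {"readme", "readme.md", "readme.rst", "readme.txt"}
CONFIG_NAMES = {"pyproject.toml", "package.json", "go.mod", "pom.xml", "cargo.toml"}


def triage_files(root: Path) -> dict[str, list[str]]:
    """Group files under root by the role their name suggests."""
    buckets: dict[str, list[str]] = {"entry_points": [], "docs": [], "config": []}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root).as_posix()
        if path.name in ENTRY_POINT_NAMES:
            buckets["entry_points"].append(relative)
        elif path.name.lower() in DOC_NAMES:
            buckets["docs"].append(relative)
        elif path.name.lower() in CONFIG_NAMES:
            buckets["config"].append(relative)
    return buckets
```

Filenames are only a hint; a file named `main.py` may still turn out to be a helper, so the result is a reading order rather than a conclusion.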
Writing Style
- Be concise: Avoid unnecessary verbosity
- Be specific: Use concrete examples and actual names from the code
- Be hierarchical: Start broad, then drill down
- Be actionable: Help users understand how to use or modify the code
Handling Different Languages
Adapt terminology and patterns to the language:
- Python: Modules, packages, decorators, list comprehensions
- JavaScript: Modules, components, promises, async/await
- Java: Packages, classes, interfaces, annotations
- C/C++: Headers, source files, namespaces, templates
- Go: Packages, goroutines, channels, interfaces
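A simple lookup from file extension to the vocabulary listed above is one way to pick terminology. The mapping below is an assumed, partial sketch (`terminology_for` and both tables are hypothetical); extensions are an imperfect signal, for example `.h` files may be C or C++.

```python
from pathlib import PurePath

# Terminology taken from the list above, keyed by language.
TERMINOLOGY = {
    "Python": ["modules", "packages", "decorators", "list comprehensions"],
    "JavaScript": ["modules", "components", "promises", "async/await"],
    "Java": ["packages", "classes", "interfaces", "annotations"],
    "C/C++": ["headers", "source files", "namespaces", "templates"],
    "Go": ["packages", "goroutines", "channels", "interfaces"],
}

# Assumed, partial extension mapping for illustration.
EXTENSION_TO_LANGUAGE = {
    ".py": "Python",
    ".js": "JavaScript",
    ".java": "Java",
    ".c": "C/C++",
    ".h": "C/C++",
    ".cpp": "C/C++",
    ".hpp": "C/C++",
    ".go": "Go",
}


def terminology_for(path: str) -> list[str]:
    """Return the vocabulary to use for a file, or [] if the language is unknown."""
    language = EXTENSION_TO_LANGUAGE.get(PurePath(path).suffix.lower())
    return TERMINOLOGY.get(language, [])
```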
Common Scenarios
Scenario 1: Understanding a New Codebase
User: "Can you summarize this repository?"
Response approach:
- Analyze directory structure
- Read main entry points and README
- Provide high-level overview with component list
- Offer to drill down into specific areas
Scenario 2: Explaining a Specific Function
User: "What does this function do?" [provides code]
Response approach:
- Identify function purpose
- Explain core logic step-by-step
- Note inputs, outputs, and side effects
- Highlight any notable patterns or concerns
Scenario 3: Comparing Implementations
User: "Summarize these two implementations and compare them"
Response approach:
- Summarize each implementation separately
- Identify key differences in approach
- Compare trade-offs (performance, readability, maintainability)
- Recommend based on context if appropriate
Scenario 4: Legacy Code Understanding
User: "Help me understand this old code"
Response approach:
- Identify the era/style of the code
- Explain outdated patterns or conventions
- Summarize what it does in modern terms
- Suggest modern equivalents if relevant
Output Format Guidelines
For Small-Scale Code
Use clear markdown with:
- Heading for the summary
- Bullet points for core logic
- Code blocks for examples if helpful
- Bold for emphasis on key terms
For Large-Scale Code
Use structured markdown with:
- Clear section headings
- Numbered or bulleted lists for components
- Tables for comparing multiple items
- Collapsible sections for optional details (if supported)
Code References
When referencing specific code elements:
- Use `backticks` for function/class/variable names
- Include file paths when relevant: `src/utils/validator.py:validate_email()`
- Use line numbers for large files: `lines 45-67`
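These reference conventions can be applied mechanically. The helper below is a small sketch with a hypothetical name (`format_reference`); the exact way the path, symbol, and line range are combined is an assumption.

```python
from typing import Optional


def format_reference(
    symbol: str,
    path: Optional[str] = None,
    lines: Optional[tuple[int, int]] = None,
) -> str:
    """Format a code reference: backticked name, optional path and line range."""
    # Prefix the symbol with its file path when one is known.
    target = f"{path}:{symbol}" if path else symbol
    reference = f"`{target}`"
    # Append the line range for large files.
    if lines is not None:
        reference += f" (lines {lines[0]}-{lines[1]})"
    return reference
```

For example, `format_reference("validate_email()", "src/utils/validator.py", (45, 67))` yields a backticked `src/utils/validator.py:validate_email()` followed by `(lines 45-67)`.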
Limitations and Considerations
- Context limits: Very large codebases may need to be analyzed in chunks
- Missing context: Clarifying questions about business logic may be needed
- Language expertise: Summaries are most accurate for well-known languages and frameworks
- Dynamic behavior: Runtime behavior cannot be fully analyzed without executing the code
- External dependencies: Full context for third-party libraries may be unavailable
When encountering limitations, acknowledge them and offer alternative approaches or ask for additional context.
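For the context-limit case, one possible approach is to group files into chunks that each stay under a line budget and summarize the chunks one at a time. The sketch below assumes a hypothetical `chunk_files` helper and a caller-chosen `max_lines` budget; it counts lines, not tokens.

```python
def chunk_files(file_line_counts: dict[str, int], max_lines: int) -> list[list[str]]:
    """Greedily group file paths so each chunk stays within max_lines.

    A single file larger than max_lines is placed in a chunk of its own.
    """
    chunks: list[list[str]] = []
    current: list[str] = []
    used = 0
    for path, line_count in file_line_counts.items():
        # Close the current chunk if adding this file would exceed the budget.
        if current and used + line_count > max_lines:
            chunks.append(current)
            current, used = [], 0
        current.append(path)
        used += line_count
    if current:
        chunks.append(current)
    return chunks
```

Grouping in input order keeps neighbouring files together; ordering the input by directory first is a reasonable refinement when related files should share a chunk.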