Code Summarizer

Generate clear, concise summaries of source code at any scale - from individual functions to entire codebases.

Overview

This skill helps analyze and summarize code by adapting the level of detail to the code's scale:

Small-scale code (functions, classes, small files): Provide focused summaries of intention and implementation
Large-scale code (modules, packages, entire repositories): Provide hierarchical summaries with progressive drill-down

Workflow Decision Tree

User provides code → Assess scale
    ├─ Small-scale (< 200 lines, single file/function)
    │   └─ Generate focused summary
    │
    └─ Large-scale (> 200 lines or multiple files/modules)
        ├─ Generate high-level overview
        ├─ List main modules/components
        └─ Prompt user to select specific parts for detailed analysis
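
The scale check at the top of the tree could be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the function name `assess_scale` and the `files` mapping are hypothetical, and treating an input of exactly 200 lines as large-scale is an assumption, since the tree leaves that case open.

```python
SMALL_SCALE_MAX_LINES = 200  # threshold taken from the decision tree


def assess_scale(files: dict[str, str]) -> str:
    """Classify the input as "small" or "large".

    `files` is a hypothetical mapping of file path -> source text.
    """
    total_lines = sum(len(text.splitlines()) for text in files.values())
    # Small-scale: a single file (or snippet) under the line threshold.
    # Assumption: exactly 200 lines is treated as large-scale.
    if len(files) <= 1 and total_lines < SMALL_SCALE_MAX_LINES:
        return "small"
    return "large"
```

A single short snippet is classified as `"small"`; anything spanning several files, or a single long file, falls through to `"large"` and the hierarchical workflow.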

Small-Scale Code Summarization

For functions, classes, or small files (typically < 200 lines), provide a focused summary that includes:

Summary Structure

Purpose Statement (1-2 sentences)
- What does this code do?
- What problem does it solve?
Core Logic (2-4 bullet points)
- Key algorithms or approaches used
- Important data transformations
- Critical control flow decisions
Key Details
- Input parameters and their purposes
- Return values and their meaning
- Important side effects or state changes
- Dependencies on external libraries or modules
Notable Patterns (if applicable)
- Design patterns used
- Optimization techniques
- Error handling approaches

Example Format

## Summary

**Purpose**: This function validates user email addresses and normalizes them to lowercase format before database storage.

**Core Logic**:
- Uses regex pattern matching to validate email format (RFC 5322 compliant)
- Strips whitespace and converts to lowercase for consistency
- Checks against a blocklist of disposable email domains
- Logs validation failures for security monitoring

**Key Details**:
- Input: `email` (string) - raw email address from user input
- Returns: `normalized_email` (string) or raises `ValidationError`
- Side effect: Logs to `security.log` on validation failure
- Dependencies: `re`, `logging`, custom `EmailBlocklist` class

**Notable Patterns**:
- Uses early return pattern for validation failures
- Implements defensive programming with input sanitization
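
If summaries are assembled programmatically, the structure above could be modeled along these lines. This is only a sketch: the `SmallScaleSummary` class and its field names are hypothetical, chosen to mirror the four parts of the format, and the optional "Notable Patterns" section is omitted when empty.

```python
from dataclasses import dataclass, field


@dataclass
class SmallScaleSummary:
    """Hypothetical container mirroring the four parts of the summary format."""
    purpose: str
    core_logic: list[str]
    key_details: list[str]
    notable_patterns: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        parts = ["## Summary", "", f"**Purpose**: {self.purpose}"]
        sections = [
            ("Core Logic", self.core_logic),
            ("Key Details", self.key_details),
            ("Notable Patterns", self.notable_patterns),
        ]
        for title, items in sections:
            if not items:
                continue  # skip empty sections, e.g. optional "Notable Patterns"
            parts += ["", f"**{title}**:"]
            parts += [f"- {item}" for item in items]
        return "\n".join(parts)
```

Calling `to_markdown()` on an instance yields text in the same layout as the example format, with one bullet per list entry.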

Large-Scale Code Summarization

For modules, packages, or entire repositories (typically > 200 lines or multiple files), use a hierarchical approach:

Phase 1: High-Level Overview

Provide a concise overview that includes:

Project Purpose (2-3 sentences)
- What does this codebase do?
- What is its primary use case or domain?
Architecture Overview
- Overall design pattern (MVC, microservices, layered, etc.)
- Key architectural decisions
- Technology stack
Main Components (list with brief descriptions)
- List 5-10 major modules/packages
- One-line description for each
- Indicate relationships between components
Entry Points
- Main execution files
- Key API endpoints or interfaces
- Configuration files

Phase 2: Interactive Drill-Down

After providing the overview, prompt the user to select specific areas for detailed analysis:

## Detailed Analysis Available

I can provide more detailed summaries of specific components:

1. **[Component Name]** - [Brief description]
2. **[Component Name]** - [Brief description]
3. **[Component Name]** - [Brief description]
...

Which component(s) would you like me to analyze in detail? You can:
- Select one or more by number
- Ask about specific functionality (e.g., "How does authentication work?")
- Request a specific file or module by name
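
One way to fill in this template and interpret a numeric reply is sketched below. Both helper names (`build_drill_down_prompt`, `select_components`) are hypothetical, and the selection logic handles only replies that contain component numbers; free-form questions or file names would need separate handling.

```python
import re

Component = tuple[str, str]  # (name, brief description)


def build_drill_down_prompt(components: list[Component]) -> str:
    """Render the drill-down prompt from a list of components."""
    lines = [
        "## Detailed Analysis Available",
        "",
        "I can provide more detailed summaries of specific components:",
        "",
    ]
    for number, (name, description) in enumerate(components, start=1):
        lines.append(f"{number}. **{name}** - {description}")
    lines += [
        "",
        "Which component(s) would you like me to analyze in detail? You can:",
        "- Select one or more by number",
        '- Ask about specific functionality (e.g., "How does authentication work?")',
        "- Request a specific file or module by name",
    ]
    return "\n".join(lines)


def select_components(reply: str, components: list[Component]) -> list[Component]:
    """Return the components whose numbers appear in the reply, without duplicates."""
    chosen: list[Component] = []
    for token in re.findall(r"\d+", reply):
        index = int(token) - 1
        # Ignore numbers that do not correspond to a listed component.
        if 0 <= index < len(components) and components[index] not in chosen:
            chosen.append(components[index])
    return chosen
```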

Phase 3: Detailed Component Analysis

When the user selects a component, provide a detailed summary that adapts the small-scale format to that component:

Purpose and responsibilities
Key classes/functions within the component
Interactions with other components
Important algorithms or business logic
Configuration and dependencies

Best Practices

Code Analysis Approach

Read strategically
- Start with entry points (main files, `__init__.py`, index files)
- Examine directory structure for organization patterns
- Look for README, documentation, or comments
- Identify configuration files
Identify patterns
- Recognize common design patterns
- Note architectural styles
- Identify framework conventions
Focus on intent over implementation
- Explain what and why before how
- Highlight business logic over boilerplate
- Emphasize key algorithms over routine operations
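
The "read strategically" ordering can be approximated by ranking file paths before reading them. The sketch below is illustrative only: the file-name sets are hypothetical and far from exhaustive, and the function name `rank_reading_order` is an assumption rather than part of this skill.

```python
from pathlib import PurePosixPath

# Hypothetical, non-exhaustive name lists; adjust per project and language.
ENTRY_POINT_NAMES = {"main.py", "__init__.py", "__main__.py", "index.js", "index.ts", "main.go"}
CONFIG_NAMES = {"pyproject.toml", "package.json", "go.mod", "pom.xml", "Cargo.toml"}


def rank_reading_order(paths: list[str]) -> list[str]:
    """Order paths: entry points, then README files, then config, then the rest."""

    def priority(path: str) -> int:
        name = PurePosixPath(path).name
        if name in ENTRY_POINT_NAMES:
            return 0
        if name.lower().startswith("readme"):
            return 1
        if name in CONFIG_NAMES:
            return 2
        return 3

    # Ties within a priority level are broken alphabetically by path.
    return sorted(paths, key=lambda p: (priority(p), p))
```

For example, given `["src/util.py", "pyproject.toml", "README.md", "src/main.py"]`, the entry point `src/main.py` is read first and `src/util.py` last.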

Writing Style

Be concise: Avoid unnecessary verbosity
Be specific: Use concrete examples and actual names from the code
Be hierarchical: Start broad, then drill down
Be actionable: Help users understand how to use or modify the code

Handling Different Languages

Adapt terminology and patterns to the language:

Python: Modules, packages, decorators, list comprehensions
JavaScript: Modules, components, promises, async/await
Java: Packages, classes, interfaces, annotations
C/C++: Headers, source files, namespaces, templates
Go: Packages, goroutines, channels, interfaces
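
The mapping above could be kept as a simple lookup table keyed by file extension. This is a sketch under assumptions: the extension table is hypothetical and incomplete, and unknown extensions simply return an empty list.

```python
from pathlib import PurePosixPath

# Terminology per language, taken from the list above.
TERMINOLOGY = {
    "Python": ["modules", "packages", "decorators", "list comprehensions"],
    "JavaScript": ["modules", "components", "promises", "async/await"],
    "Java": ["packages", "classes", "interfaces", "annotations"],
    "C/C++": ["headers", "source files", "namespaces", "templates"],
    "Go": ["packages", "goroutines", "channels", "interfaces"],
}

# Hypothetical extension table; extend as needed.
EXTENSION_TO_LANGUAGE = {
    ".py": "Python",
    ".js": "JavaScript",
    ".java": "Java",
    ".c": "C/C++",
    ".h": "C/C++",
    ".cpp": "C/C++",
    ".hpp": "C/C++",
    ".go": "Go",
}


def terminology_for(path: str) -> list[str]:
    """Return the vocabulary to prefer when summarizing the given file."""
    language = EXTENSION_TO_LANGUAGE.get(PurePosixPath(path).suffix.lower())
    return TERMINOLOGY.get(language, [])
```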

Common Scenarios

Scenario 1: Understanding a New Codebase

User: "Can you summarize this repository?"

Response approach:

Analyze directory structure
Read main entry points and README
Provide high-level overview with component list
Offer to drill down into specific areas

Scenario 2: Explaining a Specific Function

User: "What does this function do?" [provides code]

Response approach:

Identify function purpose
Explain core logic step-by-step
Note inputs, outputs, and side effects
Highlight any notable patterns or concerns

Scenario 3: Comparing Implementations

User: "Summarize these two implementations and compare them"

Response approach:

Summarize each implementation separately
Identify key differences in approach
Compare trade-offs (performance, readability, maintainability)
Recommend based on context if appropriate

Scenario 4: Legacy Code Understanding

User: "Help me understand this old code"

Response approach:

Identify the era/style of the code
Explain outdated patterns or conventions
Summarize what it does in modern terms
Suggest modern equivalents if relevant

Output Format Guidelines

For Small-Scale Code

Use clear markdown with:

Heading for the summary
Bullet points for core logic
Code blocks for examples if helpful
Bold for emphasis on key terms

For Large-Scale Code

Use structured markdown with:

Clear section headings
Numbered or bulleted lists for components
Tables for comparing multiple items
Collapsible sections for optional details (if supported)

Code References

When referencing specific code elements:

Use backticks for function/class/variable names
Include file paths when relevant: `src/utils/validator.py:validate_email()`
Use line numbers for large files: lines 45-67
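
A small helper can keep references consistent with these conventions. The function name `format_reference` and its parameters are hypothetical; the output format simply follows the examples above.

```python
from typing import Optional


def format_reference(
    path: str,
    symbol: Optional[str] = None,
    lines: Optional[tuple[int, int]] = None,
) -> str:
    """Format a code reference as backticked `path:symbol()` plus an optional line range."""
    reference = path
    if symbol:
        reference += f":{symbol}()"
    text = f"`{reference}`"
    if lines:
        start, end = lines
        # Use the singular form when the range covers a single line.
        text += f" (line {start})" if start == end else f" (lines {start}-{end})"
    return text
```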

Limitations and Considerations

Context limits: Very large codebases may need to be analyzed in chunks
Missing context: Clarifying questions about business logic may be needed
Language expertise: Summaries are most accurate for well-known languages and frameworks
Dynamic behavior: Runtime behavior cannot be fully analyzed without executing the code
External dependencies: Full context for third-party libraries may not be available

When encountering limitations, acknowledge them and offer alternative approaches or ask for additional context.
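
For the context-limit case, one possible way to analyze a large file in chunks is to split it into overlapping line windows, so that each chunk carries a little context from the previous one. The sketch below is an assumption-laden illustration: the name `chunk_lines` and the default sizes are hypothetical, and splitting on line counts ignores function or class boundaries.

```python
def chunk_lines(source: str, max_lines: int = 200, overlap: int = 20) -> list[tuple[int, int, str]]:
    """Split source into overlapping chunks.

    Returns (first_line, last_line, text) tuples with 1-based, inclusive line numbers.
    """
    if overlap < 0 or max_lines <= overlap:
        raise ValueError("max_lines must be greater than overlap, and overlap non-negative")
    lines = source.splitlines()
    chunks = []
    start = 0
    while start < len(lines):
        end = min(start + max_lines, len(lines))
        chunks.append((start + 1, end, "\n".join(lines[start:end])))
        if end == len(lines):
            break
        start = end - overlap  # step back so consecutive chunks share some lines
    return chunks
```

The returned line numbers can be reused when citing code, so that a summary of one chunk still points at the right place in the original file.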
