Python Performance Optimization

Profile, analyze, and optimize Python code for better performance - CPU profiling, memory optimization, and implementation best practices.

When to Invoke

User reports slow Python code or asks to speed up execution
Profiling or benchmarking Python applications
Reducing CPU time, memory consumption, or I/O wait
Optimizing database queries or data processing pipelines
Debugging memory leaks or excessive memory usage
Choosing between parallelization strategies (threading, multiprocessing, async)
Evaluating algorithmic vs implementation-level improvements

Core Concepts

Profiling Types

CPU Profiling: Identify time-consuming functions (cProfile, py-spy)
Memory Profiling: Track memory allocation and leaks (tracemalloc, memory_profiler)
Line Profiling: Profile at line-by-line granularity (line_profiler)
Call Graph: Visualize function call relationships

Performance Metrics

Execution Time: How long operations take
Memory Usage: Peak and average memory consumption
CPU Utilization: Processor usage patterns
I/O Wait: Time spent on I/O operations

Optimization Strategies

Algorithmic: Better algorithms and data structures
Implementation: More efficient code patterns
Parallelization: Multi-threading/processing
Caching: Avoid redundant computation
Native Extensions: C/Rust for critical paths

Quick Start

import time
import timeit

# Simple timing
start = time.time()
result = sum(range(1000000))
print(f"Execution time: {time.time() - start:.4f} seconds")

# Accurate benchmarking with timeit
execution_time = timeit.timeit("sum(range(1000000))", number=100)
print(f"Average time: {execution_time/100:.6f} seconds")

Profiling Tools Summary

cProfile - CPU Profiling

python -m cProfile -o output.prof script.py
python -m pstats output.prof

line_profiler - Line-by-Line

uv add --dev line-profiler    # or: uv tool install line-profiler
kernprof -l -v script.py

memory_profiler - Memory Usage

uv add --dev memory-profiler
python -m memory_profiler script.py

py-spy - Production Profiling

uv tool install py-spy
py-spy record -o profile.svg -- python script.py
py-spy top --pid 12345

Key Optimization Patterns

Data Structure Selection

Dict/Set for lookups: O(1) vs O(n) for list search
Generators for large datasets: Constant memory vs full list
slots on classes: Reduces per-instance memory

Code-Level Optimizations

List comprehensions over loops (faster C implementation)
str.join() over += concatenation
Local variables over global access in hot loops
Inline simple operations in tight loops
Built-in functions (implemented in C)

Caching

functools.lru_cache for expensive pure functions
weakref.WeakValueDictionary for GC-friendly caches

Parallelization

multiprocessing: CPU-bound tasks, true parallelism
threading: I/O-bound tasks with shared memory
asyncio: I/O-bound tasks with many concurrent operations

Memory Optimization

tracemalloc for detecting memory leaks (snapshot comparison)
Iterators over lists for file/stream processing
weakref caches to allow garbage collection

Database Optimization

Batch operations with executemany() and single commit
Index frequently queried columns
Select only needed columns (avoid SELECT *)
Use EXPLAIN QUERY PLAN for analysis

Benchmarking

from functools import wraps
import time

def benchmark(func):
    """Decorator to benchmark function execution."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f} seconds")
        return result
    return wrapper

For pytest-based benchmarking: pip install pytest-benchmark

Best Practices

Profile before optimizing - measure to find real bottlenecks
Focus on hot paths - optimize code that runs most frequently
Use appropriate data structures - dict for lookups, set for membership
Avoid premature optimization - clarity first, then optimize
Use built-in functions - they are implemented in C
Cache expensive computations - use lru_cache
Batch I/O operations - reduce system calls
Use generators for large datasets
Consider NumPy for numerical operations
Profile production code - use py-spy for live systems

Common Pitfalls

Optimizing without profiling
Using global variables unnecessarily
Not using appropriate data structures
Creating unnecessary copies of data
Not using connection pooling for databases
Ignoring algorithmic complexity
Over-optimizing rare code paths
Not considering memory usage

Performance Checklist

References

references/optimization-patterns.md - detailed code examples for all profiling tools, optimization patterns (list comprehensions, generators, string concat, dict lookups, local vars, function call overhead), advanced optimization (NumPy, lru_cache, slots, multiprocessing, async I/O), database optimization, memory leak detection, and benchmarking tools

Resources

cProfile: Built-in CPU profiler
memory_profiler: Memory usage profiling
line_profiler: Line-by-line profiling
py-spy: Sampling profiler for production
NumPy: High-performance numerical computing
Cython: Compile Python to C
PyPy: Alternative Python interpreter with JIT

python-performance-optimization

Python Performance Optimization

When to Invoke

Core Concepts

Profiling Types

Performance Metrics

Optimization Strategies

Quick Start

Profiling Tools Summary

cProfile - CPU Profiling

line_profiler - Line-by-Line

memory_profiler - Memory Usage

py-spy - Production Profiling

Key Optimization Patterns

Data Structure Selection

Code-Level Optimizations

Caching

Parallelization

Memory Optimization

Database Optimization

Benchmarking

Best Practices

Common Pitfalls

Performance Checklist

References

Resources

More from acaprino/alfio-claude-plugins

python-refactor

file-organizer

legal-advisor

deep-dive-analysis

python-comments

tauri2-mobile