python-performance-optimization
Python Performance Optimization
Profile, analyze, and optimize Python code for better performance - CPU profiling, memory optimization, and implementation best practices.
When to Invoke
- User reports slow Python code or asks to speed up execution
- Profiling or benchmarking Python applications
- Reducing CPU time, memory consumption, or I/O wait
- Optimizing database queries or data processing pipelines
- Debugging memory leaks or excessive memory usage
- Choosing between parallelization strategies (threading, multiprocessing, async)
- Evaluating algorithmic vs implementation-level improvements
Core Concepts
Profiling Types
- CPU Profiling: Identify time-consuming functions (cProfile, py-spy)
- Memory Profiling: Track memory allocation and leaks (tracemalloc, memory_profiler)
- Line Profiling: Profile at line-by-line granularity (line_profiler)
- Call Graph: Visualize function call relationships
Performance Metrics
- Execution Time: How long operations take
- Memory Usage: Peak and average memory consumption
- CPU Utilization: Processor usage patterns
- I/O Wait: Time spent on I/O operations
Optimization Strategies
- Algorithmic: Better algorithms and data structures
- Implementation: More efficient code patterns
- Parallelization: Multi-threading/processing
- Caching: Avoid redundant computation
- Native Extensions: C/Rust for critical paths
Quick Start
import time
import timeit
# Simple timing (perf_counter is monotonic and has higher resolution than time.time)
start = time.perf_counter()
result = sum(range(1000000))
print(f"Execution time: {time.perf_counter() - start:.4f} seconds")
# More reliable benchmarking with timeit: total for 100 runs, divided to get the mean
execution_time = timeit.timeit("sum(range(1000000))", number=100)
print(f"Average time: {execution_time/100:.6f} seconds")
Profiling Tools Summary
cProfile - CPU Profiling
python -m cProfile -o output.prof script.py
python -m pstats output.prof
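The same profile can also be collected from inside a program when only one code path is of interest. The following is a minimal sketch using the standard `cProfile` and `pstats` modules; `slow_sum` is a hypothetical workload invented for illustration, not part of the original material.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares computed in a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Profile only the call we care about.
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(200_000)
profiler.disable()

# Render the ten most expensive entries, sorted by cumulative time, into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(10)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the high-level function that owns the cost; sorting by `pstats.SortKey.TIME` instead highlights the functions where the time is spent directly.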
line_profiler - Line-by-Line
uv add --dev line-profiler # or: uv tool install line-profiler
kernprof -l -v script.py
memory_profiler - Memory Usage
uv add --dev memory-profiler
python -m memory_profiler script.py
py-spy - Production Profiling
uv tool install py-spy
py-spy record -o profile.svg -- python script.py
py-spy top --pid 12345
Key Optimization Patterns
Data Structure Selection
- Dict/Set for lookups: average O(1) vs O(n) for list search
- Generators for large datasets: Constant memory vs full list
- __slots__ on classes: Reduces per-instance memory
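The three choices above can be illustrated in one short sketch. The class names `PointDict` and `PointSlots` and the data sizes are assumptions chosen for illustration. Note that `sys.getsizeof` reports only the size of the container object itself, which is still enough to show the difference between a list and a generator.

```python
import sys

# Membership: a list is scanned element by element, a set is hashed.
ids_list = list(range(100_000))
ids_set = set(ids_list)
found_in_list = 99_999 in ids_list  # walks the whole list in the worst case
found_in_set = 99_999 in ids_set    # one hash lookup on average

# Generators: the list stores a reference to every element,
# the generator stores only its own iteration state.
squares_list = [n * n for n in range(100_000)]
squares_gen = (n * n for n in range(100_000))
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)
print(f"list: {list_bytes} bytes, generator: {gen_bytes} bytes")


class PointDict:
    """Regular class: each instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    """Slotted class: attributes live in fixed slots, no per-instance __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


has_dict = hasattr(PointDict(1, 2), "__dict__")
slots_has_dict = hasattr(PointSlots(1, 2), "__dict__")
print(f"PointDict has __dict__: {has_dict}, PointSlots has __dict__: {slots_has_dict}")
```

The trade-off with `__slots__` is that instances can no longer gain arbitrary new attributes, so it is best suited to classes that are instantiated in large numbers with a fixed set of fields.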
Code-Level Optimizations
- List comprehensions over explicit append loops (less per-iteration interpreter overhead)
- str.join() over += concatenation
- Local variables over global access in hot loops
- Inline simple operations in tight loops
- Built-in functions (implemented in C)
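Two of the patterns above can be sketched as follows. The function names are hypothetical and exist only to contrast the slower and faster forms; actual speedups depend on the interpreter version and input size, so measure with `timeit` before relying on them.

```python
import math


def build_with_concat(parts):
    """Repeated += may copy the growing string on each iteration."""
    out = ""
    for part in parts:
        out += part
    return out


def build_with_join(parts):
    """str.join computes the final size once and copies each part once."""
    return "".join(parts)


def roots_attribute_lookup(values):
    """Looks up the global name math and its sqrt attribute on every iteration."""
    return [math.sqrt(v) for v in values]


def roots_local_binding(values):
    """Binds math.sqrt to a local name once, outside the hot loop."""
    sqrt = math.sqrt
    return [sqrt(v) for v in values]


parts = [str(i) for i in range(1_000)]
values = list(range(1_000))

joined = build_with_join(parts)
concatenated = build_with_concat(parts)
roots_a = roots_attribute_lookup(values)
roots_b = roots_local_binding(values)
```

Both pairs return identical results, which makes them easy to swap in after profiling confirms that the loop is a hot path.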
Caching
- functools.lru_cache for expensive pure functions
- weakref.WeakValueDictionary for GC-friendly caches
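A minimal sketch of both caching tools follows. The `fib` function, the `call_count` counter, and the `Report` class are assumptions introduced for illustration. The weak-reference part relies on the entry disappearing once the last strong reference is dropped; `gc.collect()` is called to make that explicit on interpreters that do not free objects immediately.

```python
import gc
import weakref
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=128)
def fib(n):
    """Pure function: same input always yields the same output, so it is safe to cache."""
    global call_count
    call_count += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)


value = fib(30)
info = fib.cache_info()
print(value, info)


class Report:
    """Hypothetical expensive object that should not be kept alive by the cache."""

    def __init__(self, name):
        self.name = name


weak_cache = weakref.WeakValueDictionary()
report = Report("q1")
weak_cache["q1"] = report
alive_while_referenced = "q1" in weak_cache

del report    # drop the only strong reference
gc.collect()  # make collection explicit
gone_after_release = "q1" not in weak_cache
```

With the cache in place the body of `fib` runs once per distinct argument instead of an exponential number of times. `lru_cache` holds strong references to its results, so the weak-value dictionary is the better fit when cached objects should be released as soon as nothing else uses them.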
Parallelization
- multiprocessing: CPU-bound tasks, true parallelism
- threading: I/O-bound tasks with shared memory (in standard CPython builds the GIL prevents CPU-bound speedups)
- asyncio: I/O-bound tasks with many concurrent operations
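The two I/O-bound options can be compared side by side in a short sketch. `fetch_blocking` and `fetch_async` are hypothetical stand-ins for real network or disk calls, simulated here with sleeps. For CPU-bound work, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface as the thread pool shown below, with the usual requirement that the worker function be importable.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_blocking(item):
    """Hypothetical blocking I/O call, simulated with time.sleep."""
    time.sleep(0.05)
    return item * 2


async def fetch_async(item):
    """Hypothetical non-blocking I/O call, simulated with asyncio.sleep."""
    await asyncio.sleep(0.05)
    return item * 2


async def fetch_all(items):
    # gather runs the coroutines concurrently and preserves input order.
    return await asyncio.gather(*(fetch_async(i) for i in items))


items = list(range(8))

# Threads: the sleeps overlap because a blocked thread releases the GIL.
with ThreadPoolExecutor(max_workers=8) as pool:
    threaded = list(pool.map(fetch_blocking, items))

# asyncio: a single thread switches between tasks at each await.
gathered = asyncio.run(fetch_all(items))

print(threaded, gathered)
```

Threads are usually the smaller change when the code already uses blocking libraries; asyncio scales to far more concurrent operations but requires async-aware libraries throughout the call chain.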
Memory Optimization
- tracemalloc for detecting memory leaks (snapshot comparison)
- Iterators over lists for file/stream processing
- weakref caches to allow garbage collection
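Snapshot comparison with `tracemalloc` can be sketched as below. The growing `leaky_store` list and the `handle_request` function are assumptions that simulate a leak; in a real application the two snapshots would bracket a period of normal traffic.

```python
import tracemalloc

leaky_store = []  # hypothetical module-level list that is never cleared


def handle_request(i):
    """Simulated leak: every call keeps about 1 KB alive forever."""
    leaky_store.append("x" * 1000 + str(i))


tracemalloc.start()
before = tracemalloc.take_snapshot()

for i in range(5_000):
    handle_request(i)

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Differences grouped by source line; the largest changes come first.
top = after.compare_to(before, "lineno")
for stat in top[:3]:
    print(stat)

growth = sum(stat.size_diff for stat in top)
print(f"net growth: {growth / 1024:.0f} KiB")
```

The line that shows steadily increasing `size_diff` across repeated comparisons is the first place to look for a container that only ever grows.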
Database Optimization
- Batch operations with executemany() and single commit
- Index frequently queried columns
- Select only needed columns (avoid SELECT *)
- Use EXPLAIN QUERY PLAN for analysis
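The four points above can be sketched with the standard-library `sqlite3` module, which supports the `EXPLAIN QUERY PLAN` syntax mentioned here. The `users` table, its columns, and the index name are assumptions made up for this example; other databases use different explain syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

# Batch insert: one executemany call and one commit instead of 10,000 of each.
rows = [(i, f"user{i}@example.com", f"User {i}") for i in range(10_000)]
conn.executemany("INSERT INTO users (id, email, name) VALUES (?, ?, ?)", rows)
conn.commit()

# Index the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_users_email ON users (email)")

# Select only the column that is needed rather than SELECT *.
query = "SELECT name FROM users WHERE email = ?"
params = ("user42@example.com",)

# The last column of each plan row holds the human-readable detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)

name = conn.execute(query, params).fetchone()[0]
conn.close()
```

If the plan text reports a scan of the table rather than a search using the index, the query is not benefiting from the index and either the index or the query shape needs another look.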
Benchmarking
from functools import wraps
import time
def benchmark(func):
    """Decorator to benchmark function execution."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f} seconds")
        return result
    return wrapper
For pytest-based benchmarking: pip install pytest-benchmark
Best Practices
- Profile before optimizing - measure to find real bottlenecks
- Focus on hot paths - optimize code that runs most frequently
- Use appropriate data structures - dict for lookups, set for membership
- Avoid premature optimization - clarity first, then optimize
- Use built-in functions - they are implemented in C
- Cache expensive computations - use lru_cache
- Batch I/O operations - reduce system calls
- Use generators for large datasets
- Consider NumPy for numerical operations
- Profile production code - use py-spy for live systems
Common Pitfalls
- Optimizing without profiling
- Using global variables unnecessarily
- Not using appropriate data structures
- Creating unnecessary copies of data
- Not using connection pooling for databases
- Ignoring algorithmic complexity
- Over-optimizing rare code paths
- Not considering memory usage
Performance Checklist
- Profiled code to identify bottlenecks
- Used appropriate data structures
- Implemented caching where beneficial
- Optimized database queries
- Used generators for large datasets
- Considered multiprocessing for CPU-bound tasks
- Used async I/O for I/O-bound tasks
- Minimized function call overhead in hot loops
- Checked for memory leaks
- Benchmarked before and after optimization
References
- references/optimization-patterns.md: detailed code examples for all profiling tools, optimization patterns (list comprehensions, generators, string concatenation, dict lookups, local variables, function call overhead), advanced optimization (NumPy, lru_cache, __slots__, multiprocessing, async I/O), database optimization, memory leak detection, and benchmarking tools
Resources
- cProfile: Built-in CPU profiler
- memory_profiler: Memory usage profiling
- line_profiler: Line-by-line profiling
- py-spy: Sampling profiler for production
- NumPy: High-performance numerical computing
- Cython: Compile Python to C
- PyPy: Alternative Python interpreter with JIT