Portfolio Optimization

Overview

This skill provides guidance for implementing high-performance portfolio optimization algorithms using Python C extensions. It covers the workflow for creating C extensions that interface with NumPy arrays, proper verification strategies, and common pitfalls to avoid when optimizing numerical computations.

When to Apply This Skill

Apply this skill when:

  • Implementing portfolio risk calculations (variance, volatility, Sharpe ratio)
  • Optimizing matrix-vector operations for large asset portfolios
  • Creating C extensions for Python numerical code
  • Meeting performance requirements that specify a speedup ratio (e.g., >= 1.2x)
  • Working with covariance matrices and portfolio weights

Recommended Workflow

Phase 1: Codebase Understanding

Before writing any code:

  1. Read all relevant source files completely - Understand the baseline implementation, data structures, and expected interfaces
  2. Identify the mathematical operations - Common operations include:
    • Matrix-vector multiplication (covariance matrix times weights)
    • Dot products (weights times returns)
    • Square root operations (for volatility from variance)
  3. Understand the test suite - Know what correctness tolerances are expected (e.g., 1e-10) and what performance benchmarks must be met
  4. Document the input/output contracts - Array shapes, data types (typically float64), and return value specifications
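
As a reference point for the operations listed above, a baseline might look like the following minimal sketch. The function names and the two-asset numbers are illustrative assumptions, not taken from any particular codebase; the actual baseline in a given task should be read from its source files.

```python
import numpy as np


def portfolio_variance(weights, cov):
    """Portfolio variance: weights^T @ cov @ weights (matrix-vector, then dot)."""
    return float(weights @ (cov @ weights))


def portfolio_volatility(weights, cov):
    """Volatility is the square root of the variance."""
    return float(np.sqrt(portfolio_variance(weights, cov)))


def portfolio_return(weights, expected_returns):
    """Expected return: dot product of weights and per-asset returns."""
    return float(weights @ expected_returns)


# Hypothetical two-asset portfolio with uncorrelated assets,
# chosen so the result can be checked by hand.
weights = np.array([0.5, 0.5], dtype=np.float64)
cov = np.array([[0.04, 0.00],
                [0.00, 0.16]], dtype=np.float64)
expected_returns = np.array([0.10, 0.20], dtype=np.float64)

variance = portfolio_variance(weights, cov)        # 0.25*0.04 + 0.25*0.16
volatility = portfolio_volatility(weights, cov)
mean_return = portfolio_return(weights, expected_returns)
```

Writing the input/output contract down in this form (1-D float64 weights of length n, 2-D float64 covariance of shape n x n, Python float out) makes it easier to check the C version against it later.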

Phase 2: Implementation Planning

Consider these factors before implementation:

  1. Why C provides speedup:

    • Eliminates Python interpreter overhead between successive array operations
    • Enables direct memory access without bounds checking
    • Allows compiler optimizations (vectorization, loop unrolling)
    • Reduces temporary array allocations
  2. Design decisions to make:

    • Whether to use NumPy C API for zero-copy array access
    • Memory layout assumptions (C-contiguous vs Fortran-contiguous)
    • Error handling strategy for type mismatches and dimension errors
  3. Potential algorithmic optimizations:

    • Cache-friendly memory access patterns (row-major iteration for C arrays)
    • SIMD vectorization opportunities
    • Minimizing Python-to-C data conversion overhead
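
The memory layout question can be explored from Python before any C is written. The sketch below is illustrative only: it inspects the strides of a C-contiguous and a Fortran-contiguous array, then walks a flat row-major buffer the way a C inner loop typically would. The helper name `matvec_row_major` is a hypothetical choice for this example.

```python
import numpy as np

# A 3x4 float64 array in C (row-major) order and a Fortran-ordered copy.
cov_c = np.arange(12, dtype=np.float64).reshape(3, 4)
cov_f = np.asfortranarray(cov_c)

# Strides are in bytes: stepping one row in C order skips 4 doubles (32 bytes),
# while in Fortran order consecutive rows are adjacent (8 bytes).
c_strides = cov_c.strides
f_strides = cov_f.strides


def matvec_row_major(flat, n_rows, n_cols, vec):
    """Matrix-vector product over a flat row-major buffer.

    The inner loop reads consecutive elements, which is the cache-friendly
    access pattern a C loop would use for a C-contiguous array.
    """
    out = [0.0] * n_rows
    for i in range(n_rows):
        row_start = i * n_cols
        acc = 0.0
        for j in range(n_cols):
            acc += flat[row_start + j] * vec[j]
        out[i] = acc
    return out


flat = cov_c.ravel(order="C").tolist()
vec = [1.0, 1.0, 1.0, 1.0]
result = matvec_row_major(flat, 3, 4, vec)
```

If the C code assumes this row-major indexing, passing it a Fortran-ordered array without checking would silently compute with the transposed layout, which is why the layout decision belongs in the planning phase.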

Phase 3: C Extension Implementation

When implementing the C extension:

  1. Include proper headers:

    • Python.h (must be included before any standard library headers)
    • numpy/arrayobject.h for NumPy array access
  2. Initialize NumPy in the module init function:

    • Call import_array() to initialize NumPy C API
  3. Use NumPy C API for array access:

    • PyArray_DATA() for getting data pointer
    • PyArray_DIM() for dimensions
    • PyArray_STRIDE() for memory strides
    • Check PyArray_IS_C_CONTIGUOUS() for memory layout
  4. Implement robust error handling:

    • Validate array dimensions match expected shapes
    • Check data types with PyArray_TYPE() (expect NPY_FLOAT64, i.e. C double)
    • Handle non-contiguous arrays (either reject or handle strides)
    • Set appropriate Python exceptions on error

Phase 4: Python Wrapper Implementation

Create a Python module that:

  1. Imports the C extension module
  2. Provides a clean interface matching the baseline API
  3. Handles any necessary array preparation (ensuring contiguity)
  4. Documents the interface clearly
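
A wrapper along these lines might look like the sketch below. The extension module name `portfolio_c` and its function `portfolio_risk` are hypothetical placeholders; the NumPy fallback exists only so the sketch runs without a compiled extension and is an assumption of this example, not a required design.

```python
import numpy as np

try:
    import portfolio_c as _ext  # hypothetical name for the compiled C extension
except ImportError:
    _ext = None  # sketch-only fallback so this example runs without a build


def _prepare(array, ndim, name):
    """Return a C-contiguous float64 array, copying only when necessary."""
    arr = np.ascontiguousarray(array, dtype=np.float64)
    if arr.ndim != ndim:
        raise ValueError(f"{name} must be {ndim}-dimensional, got {arr.ndim}")
    return arr


def portfolio_risk(weights, cov):
    """Portfolio volatility sqrt(w^T @ cov @ w).

    Accepts any array-like input; the C code only ever sees
    C-contiguous float64 buffers.
    """
    w = _prepare(weights, 1, "weights")
    c = _prepare(cov, 2, "cov")
    if c.shape != (w.shape[0], w.shape[0]):
        raise ValueError(f"cov must have shape ({w.shape[0]}, {w.shape[0]}), got {c.shape}")
    if _ext is not None:
        return _ext.portfolio_risk(w, c)  # hypothetical C entry point
    return float(np.sqrt(w @ (c @ w)))


# Non-contiguous inputs: a strided view and a Fortran-ordered matrix.
w_strided = np.array([0.5, 9.0, 0.5])[::2]
cov_fortran = np.asfortranarray([[0.04, 0.00], [0.00, 0.16]])
risk = portfolio_risk(w_strided, cov_fortran)
```

Doing the contiguity and dtype conversion in Python keeps the C code simple, at the cost of a copy whenever the caller passes a strided or non-float64 array.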

Phase 5: Verification Strategy

Critical: Verify every change completely

  1. After editing files, re-read them - Confirm edits were applied correctly, especially for multi-line changes

  2. Test incrementally:

    • Build the C extension first and verify it compiles
    • Test individual functions before running full benchmarks
    • Use small test cases for correctness verification before scaling up
  3. Correctness verification:

    • Compare outputs against baseline implementation
    • Use the numerical tolerance the test suite specifies (e.g., 1e-10); a hand-written C loop can differ from NumPy in the last few bits because the summation order differs
    • Test with known inputs where expected outputs can be calculated manually
  4. Performance verification:

    • Run benchmarks with representative data sizes
    • Verify speedup meets requirements across different portfolio sizes
    • Test edge cases: small portfolios (n=1, n=10), large portfolios (n=5000+)
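
The correctness and performance checks above could be scripted roughly as follows. This is a sketch under stated assumptions: `candidate_risk` is a stand-in for the optimized function (here it just evaluates the same expression in a different order), and the sizes, seed, and tolerance are illustrative values rather than requirements from any test suite.

```python
import timeit

import numpy as np


def baseline_risk(weights, cov):
    """Reference implementation to compare against."""
    return float(np.sqrt(weights @ cov @ weights))


def candidate_risk(weights, cov):
    """Stand-in for the optimized version (assumption for this sketch)."""
    return float(np.sqrt(np.dot(weights, np.dot(cov, weights))))


def random_portfolio(n, seed=0):
    """Random weights summing to 1 and a symmetric positive semidefinite covariance."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n, n))
    cov = factors @ factors.T / n
    weights = rng.random(n)
    weights /= weights.sum()
    return weights, cov


def check_correctness(sizes=(1, 10, 200), rtol=1e-10):
    """Compare candidate and baseline on small cases before scaling up."""
    for n in sizes:
        w, c = random_portfolio(n)
        np.testing.assert_allclose(candidate_risk(w, c), baseline_risk(w, c), rtol=rtol)


def measure_speedup(n, repeats=5, number=20):
    """Return baseline_time / candidate_time using the best of several runs."""
    w, c = random_portfolio(n)
    t_base = min(timeit.repeat(lambda: baseline_risk(w, c), repeat=repeats, number=number))
    t_cand = min(timeit.repeat(lambda: candidate_risk(w, c), repeat=repeats, number=number))
    return t_base / t_cand


check_correctness()
speedup = measure_speedup(50)
```

Taking the minimum over several repeats reduces noise from other processes; running `measure_speedup` at several values of n shows whether call overhead dominates for small portfolios.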

Edge Cases to Handle

Ensure the implementation addresses:

  1. Empty portfolios (n=0) - Return appropriate default or error
  2. Single-asset portfolios (n=1) - Degenerate case: the covariance matrix is 1x1 and the variance reduces to the squared weight times the asset variance
  3. Dimension mismatches - Weights vector length vs covariance matrix dimensions
  4. Invalid inputs:
    • Non-square covariance matrices
    • NaN or infinity values in inputs
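    • Negative variance (mathematically invalid; it indicates a covariance matrix that is not positive semidefinite, or rounding error, and makes the square root NaN)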
  5. Memory considerations:
    • Non-contiguous NumPy arrays
    • Memory allocation failures in C code
    • Large portfolios that may stress memory
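
One way to cover several of these cases is to validate inputs before any computation. The sketch below is illustrative: the function names are hypothetical, and raising `ValueError` for empty portfolios and for negative variance is one possible policy (returning a default or clamping to zero are alternatives the task may prefer).

```python
import numpy as np


def validate_inputs(weights, cov):
    """Check shapes and values; return float64 arrays or raise ValueError."""
    w = np.asarray(weights, dtype=np.float64)
    c = np.asarray(cov, dtype=np.float64)
    if w.ndim != 1:
        raise ValueError("weights must be a 1-D array")
    if w.size == 0:
        raise ValueError("empty portfolio (n=0)")
    if c.ndim != 2 or c.shape[0] != c.shape[1]:
        raise ValueError("covariance matrix must be square")
    if c.shape[0] != w.shape[0]:
        raise ValueError("weights length does not match covariance dimensions")
    if not (np.isfinite(w).all() and np.isfinite(c).all()):
        raise ValueError("inputs contain NaN or infinity")
    return w, c


def safe_volatility(weights, cov):
    """Volatility with explicit handling of invalid inputs."""
    w, c = validate_inputs(weights, cov)
    variance = float(w @ (c @ w))
    if variance < 0.0:
        # Assumed policy for this sketch: reject rather than return NaN.
        raise ValueError("negative variance; covariance is not positive semidefinite")
    return variance ** 0.5


# Single-asset case: volatility is |weight| * asset standard deviation.
single_asset = safe_volatility([1.0], [[0.04]])
```

The same checks can be mirrored in the C code, but keeping them in one place and documenting which layer owns them avoids validating twice on the hot path.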

Common Pitfalls to Avoid

Code Completeness

  • Never truncate code in edit operations - always provide complete implementations
  • Verify file contents after editing to confirm changes applied correctly
  • Document all design choices explicitly

Testing Approach

  • Avoid going directly from implementation to full benchmark testing
  • Test each function individually before integration testing
  • Do not rely solely on "tests pass" for validation - understand why they pass

C Extension Specific

  • Always check NumPy array types before accessing data
  • Handle reference counting properly to avoid memory leaks
  • Initialize NumPy API with import_array() in module init
  • Use PyErr_SetString() to set exceptions on errors

Performance Validation

  • Verify speedup is consistent across different input sizes
  • Profile before attempting further optimizations
  • Consider the overhead of Python-to-C transitions for small inputs

Build and Test Commands

Typical workflow commands:

# Build the C extension
python setup.py build_ext --inplace

# Run correctness tests
python -c "from portfolio_optimized import *; # test calls"

# Run benchmark
python benchmark.py

# Run full test suite
pytest test_portfolio.py -v

Verification Checklist

Before considering the task complete:

  • All source files read and understood
  • C extension compiles without warnings
  • Individual functions tested for correctness
  • Numerical results match baseline within tolerance
  • Performance meets speedup requirements
  • Edge cases explicitly tested or handled
  • Error handling implemented for invalid inputs
  • File contents verified after all edits
  • No memory leaks in C code (proper reference counting)