Python Refactor

Purpose

Transform complex, hard-to-understand Python code into clear, well-documented, maintainable code. This skill guides systematic refactoring that prioritizes human comprehension without sacrificing correctness or reasonable performance.

When to Invoke

Invoke this skill when:

  • User explicitly requests "human", "readable", "maintainable", "clean", or "refactor" code improvements
  • Code review processes flag comprehension or maintainability issues
  • Working with legacy code that needs modernization
  • Preparing code for team onboarding or educational contexts
  • Code complexity metrics exceed reasonable thresholds
  • Functions or modules are difficult to understand or modify
  • RED FLAG indicators: file >500 lines with scattered functions and global state, multiple global statements, no clear module/class organization, configuration mixed with business logic

Do NOT invoke this skill when:

  • Code is performance-critical and profiling shows optimization is needed first
  • Code is scheduled for deletion or replacement
  • External dependencies require upstream contributions instead
  • User explicitly requests performance optimization over readability

Core Principles

Follow these principles in priority order:

  1. Prefer structured OOP for complex code - Code with shared state, multiple concerns, or scattered global functions should be restructured into well-organized classes and modules. Script-like code with global state and tangled dependencies benefits most from OOP. However, simple modules with pure functions, CLI tools using click/argparse, and functional data pipelines don't need to be forced into classes.
  2. Clarity over cleverness - Explicit, obvious code beats implicit, clever code
  3. Preserve correctness - All tests must pass; behavior must remain identical
  4. Single Responsibility - Each class and function should do one thing well (SOLID principles)
  5. Self-documenting structure - Code structure tells what, comments explain why
  6. Progressive disclosure - Reveal complexity in layers, not all at once
  7. Reasonable performance - Never accept a slowdown of more than 2x without explicit approval

Key Constraints

ALWAYS observe these constraints:

  • SAFETY BY DESIGN - Use mandatory migration checklists for destructive changes. Create new structure, search all usages, migrate all, verify, only then remove old code. NEVER remove code before 100% migration verified.
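  • STATIC ANALYSIS FIRST - Run flake8 --select=F821,E999 before tests to catch undefined names (NameErrors) and syntax errors immediately. (E0602 is Pylint's undefined-variable code, not a flake8 code.)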
  • PRESERVE BEHAVIOR - All existing tests must pass after refactoring
  • NO PERFORMANCE REGRESSION - Never degrade performance >2x without explicit user approval
  • NO API CHANGES - Public APIs remain unchanged unless explicitly requested and documented
  • NO OVER-ENGINEERING - Simple code stays simple; don't add unnecessary abstraction
  • NO MAGIC - No framework magic, no metaprogramming unless absolutely necessary
  • VALIDATE CONTINUOUSLY - Run static analysis + tests after each logical change

Regression Prevention (MANDATORY)

Refactoring must NEVER introduce technical, logical, or functional regressions.

Read and apply references/REGRESSION_PREVENTION.md before any refactoring session.

Before each refactoring session:

  • Test suite passes at 100%
  • Coverage >= 80% on target code (if not, write tests FIRST)
  • Golden outputs captured for critical edge cases
  • Static analysis baseline saved
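The "golden outputs captured" item can be approached in several ways. The following is a minimal sketch, assuming the target function takes JSON-serializable arguments and returns JSON-serializable values. The names capture_golden, find_mismatches, and legacy_discount are hypothetical and not part of this skill's scripts.

```python
import json
from pathlib import Path
from typing import Any, Callable, Sequence


def capture_golden(
    func: Callable[..., Any], cases: Sequence[Sequence[Any]], path: Path
) -> None:
    """Record func's current output for every case, before any refactoring."""
    records = [{"args": list(args), "result": func(*args)} for args in cases]
    path.write_text(json.dumps(records, indent=2))


def find_mismatches(func: Callable[..., Any], path: Path) -> list[dict[str, Any]]:
    """Return the recorded cases whose output differs from the golden file."""
    records = json.loads(path.read_text())
    return [r for r in records if func(*r["args"]) != r["result"]]


# Hypothetical legacy function, used only to illustrate the workflow.
def legacy_discount(price: int, tier: str) -> int:
    if tier == "gold":
        return price - price // 5
    return price
```

Capture once before the session starts, then call find_mismatches after each change; an empty list means the recorded edge cases still behave identically.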

After each micro-change (not at the end, EVERY SINGLE ONE):

  • flake8 --select=F821,E999 -> 0 errors
  • pytest -x -> all passing
  • Spot check 1 edge case for unchanged behavior

If ANY check fails: STOP -> REVERT -> ANALYZE -> FIX APPROACH -> RETRY
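One way to make the per-change gate hard to skip is to wrap it in a small runner that stops at the first failing command. This is a sketch, not one of the skill's bundled scripts: run_checks and DEFAULT_CHECKS are hypothetical names, and checking the current directory (".") is an assumption.

```python
import subprocess
import sys
from typing import Optional, Sequence

# Mirrors the checklist order: static analysis first, then tests.
# The "." target is an assumption; point it at the code being refactored.
DEFAULT_CHECKS: list[list[str]] = [
    [sys.executable, "-m", "flake8", "--select=F821,E999", "."],
    [sys.executable, "-m", "pytest", "-x"],
]


def run_checks(commands: Sequence[Sequence[str]]) -> tuple[bool, Optional[list[str]]]:
    """Run commands in order and stop at the first non-zero exit code.

    Returns (True, None) when everything passes, otherwise
    (False, failing_command) so the caller knows where to STOP and REVERT.
    """
    for command in commands:
        completed = subprocess.run(command, capture_output=True, text=True)
        if completed.returncode != 0:
            return False, list(command)
    return True, None
```

Calling run_checks(DEFAULT_CHECKS) after every micro-change gives a single pass/fail answer; on failure, revert before analyzing.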

ANY REGRESSION = TOTAL FAILURE OF THE REFACTORING

Refactoring Workflow

Execute refactoring in four phases with validation at each step.

Phase 1: Analysis

Before making any changes, analyze the code comprehensively:

  1. Read the entire codebase section being refactored to understand context
  2. Identify readability issues using the anti-patterns reference (see references/anti-patterns.md):
    • Check for script-like/procedural code (global state, scattered functions, no clear structure)
    • Check for God Objects/Classes (classes doing too much)
    • Complex nested conditionals, long functions, magic numbers, cryptic names, etc.
  3. Assess architecture (see references/oop_principles.md):
    • Is code organized in proper classes and modules?
    • Is there global state that should be encapsulated?
    • Are responsibilities properly separated?
    • Are SOLID principles followed?
    • Is dependency injection used instead of hard-coded dependencies?
  4. Measure current metrics using scripts/measure_complexity.py or scripts/analyze_multi_metrics.py
  5. Run linting analysis (see Tooling Recommendations below for which tool to use)
  6. Check test coverage - Identify gaps that need filling before refactoring
  7. Document findings using the analysis template (see assets/templates/analysis_template.md)

Output: Prioritized list of issues by impact and risk.

Phase 2: Planning

Plan the refactoring approach systematically with safety-by-design:

  1. Identify changes by type:

    • Non-destructive: Renames, documentation, type hints -> Low risk
    • Destructive: Removing globals, deleting functions, replacing APIs -> High risk
  2. For DESTRUCTIVE changes - CREATE MIGRATION PLAN (MANDATORY):

    • Search for ALL usages of each element to be removed
    • Document every found usage with file, line number, and usage type
    • If you cannot create a complete migration plan, you CANNOT proceed with the destructive change
  3. Risk assessment for each proposed change (Low/Medium/High)

  4. Dependency identification - What else depends on this code?

  5. Test strategy - What tests are needed? What might break?

  6. Change ordering - Sequence changes from safest to riskiest

  7. Expected outcomes - Document what metrics should improve and by how much

Output: Refactoring plan with sequenced changes, migration plans for destructive changes, test strategy, and rollback plan.
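The usage search behind a migration plan can be partly automated. The sketch below uses the standard ast module to list where a name is called, read, written, declared global, or imported, with the line number and usage type the plan asks for. Usage and find_usages are hypothetical names. The search misses attribute access such as module.name, string references, and getattr, so a plain text search is still needed alongside it.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    filename: str
    line: int
    kind: str  # "call", "read", "write", "global", or "import"


def find_usages(source: str, name: str, filename: str = "<string>") -> list[Usage]:
    """List every place `name` appears as a bare identifier in `source`."""
    tree = ast.parse(source, filename=filename)
    # Remember which Name nodes are the callee of a Call expression.
    call_targets = {
        id(node.func) for node in ast.walk(tree) if isinstance(node, ast.Call)
    }
    usages = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id == name:
            if id(node) in call_targets:
                kind = "call"
            elif isinstance(node.ctx, ast.Store):
                kind = "write"
            else:
                kind = "read"
            usages.append(Usage(filename, node.lineno, kind))
        elif isinstance(node, ast.Global) and name in node.names:
            usages.append(Usage(filename, node.lineno, "global"))
        elif isinstance(node, ast.ImportFrom) and any(
            alias.name == name for alias in node.names
        ):
            usages.append(Usage(filename, node.lineno, "import"))
    return sorted(usages, key=lambda usage: (usage.line, usage.kind))


SAMPLE = '''
counter = 0

def bump():
    global counter
    counter = counter + 1

def report():
    return format_count(counter)
'''
```

Running find_usages(SAMPLE, "counter") yields one entry per occurrence, which can be pasted into the migration checklist as-is.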

Phase 3: Execution

Apply refactoring patterns using safety-by-design workflow.

For NON-DESTRUCTIVE changes (safe to do anytime):

  1. Rename variables/functions for clarity
  2. Extract magic numbers/strings to named constants
  3. Add/improve documentation and type hints
  4. Add guard clauses to reduce nesting

For DESTRUCTIVE changes (removing/replacing code) - STRICT PROTOCOL:

  1. CREATE new structure (no removal yet) - write new classes/functions, add tests
  2. SEARCH comprehensively for ALL usages of the element being removed
  3. CREATE migration checklist documenting every found usage
  4. MIGRATE one usage at a time, checking off the list, running static analysis + tests after each
  5. VERIFY complete migration - re-run original searches, should find zero old references
  6. REMOVE old code only after 100% migration verified
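Steps 1 through 5 rely on the old and new code coexisting for a while. A minimal sketch of that intermediate state follows, assuming a hypothetical module that used a global call counter. RateLimiter and allow_call are illustrative names only.

```python
class RateLimiter:
    """New structure: the call count lives on the instance, not in a global."""

    def __init__(self, max_calls: int) -> None:
        self.max_calls = max_calls
        self.calls_made = 0

    def allow(self) -> bool:
        """Return True and count the call while the limit is not reached."""
        if self.calls_made >= self.max_calls:
            return False
        self.calls_made += 1
        return True


# Legacy module-level API, kept until the checklist shows zero remaining callers.
_default_limiter = RateLimiter(max_calls=3)


def allow_call() -> bool:
    """Old entry point; delegates to the new class during the migration."""
    return _default_limiter.allow()
```

Existing callers of allow_call() keep working while each one is migrated to an explicit RateLimiter instance; the wrapper is deleted only in step 6.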

Execution Rules

  1. NEVER skip the migration checklist for destructive changes
  2. Run static analysis BEFORE tests - Catch NameErrors immediately
  3. One pattern at a time - Never mix multiple refactoring patterns in one change
  4. Atomic commits - Each migration step gets its own commit
  5. Stop on ANY error - Static analysis errors OR test failures require immediate fix/revert

Refactoring order (recommended sequence):

  1. Transform script-like code to proper architecture (if code has global state and scattered functions). See references/examples/script_to_oop_transformation.md
  2. Rename variables/functions for clarity
  3. Extract magic numbers/strings to named constants (as class constants or enums)
  4. Add/improve documentation and type hints
  5. Extract methods to reduce function length
  6. Simplify conditionals with guard clauses
  7. Reduce nesting depth
  8. Final review: Ensure separation of concerns is clean

Output: Refactored code passing all tests with clear commit history.

Phase 4: Validation

Validate improvements objectively:

  1. Run static analysis FIRST (catch errors before tests):

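    flake8 <file> --select=F821,E999   # Undefined names, syntax errors
    flake8 <file> --select=F401        # Unused imports
    flake8 <file>                      # Full quality check
    

    MANDATORY: Zero F821 and E999 errors required. (E0602 is Pylint's undefined-variable code, not a flake8 code; if Pylint is in use, pylint --disable=all --enable=E0602 <file> performs the equivalent check.)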

  2. Run full test suite - 100% pass rate required

  3. Validate architecture improvements:

    • Confirm global state has been eliminated or properly encapsulated
    • Verify code is organized in proper modules/classes
    • Check that responsibilities are properly separated
    • Validate against SOLID principles (see references/oop_principles.md)
  4. Compare before/after metrics using scripts/measure_complexity.py or scripts/analyze_multi_metrics.py

  5. Performance regression check - Run scripts/benchmark_changes.py for hot paths

  6. Generate summary report using format from assets/templates/summary_template.md

  7. Flag for human review if:

    • Performance degraded >10%
    • Public API signatures changed
    • Test coverage decreased
    • Significant architectural changes were made

Output: Comprehensive validation report with test results, metrics comparison, performance benchmarks, and quality summary.
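The document names two performance thresholds: more than 10% slower triggers human review, and more than 2x slower requires explicit approval. Reading them as two tiers is an interpretation, and the sketch below encodes that reading. It is not the implementation of scripts/benchmark_changes.py; classify_slowdown and best_time are hypothetical helpers.

```python
import timeit
from typing import Callable

REVIEW_RATIO = 1.10    # more than 10% slower: flag for human review
APPROVAL_RATIO = 2.0   # more than 2x slower: explicit user approval required


def classify_slowdown(baseline_seconds: float, candidate_seconds: float) -> str:
    """Map a before/after timing pair onto the two documented thresholds."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    ratio = candidate_seconds / baseline_seconds
    if ratio > APPROVAL_RATIO:
        return "needs_approval"
    if ratio > REVIEW_RATIO:
        return "needs_review"
    return "ok"


def best_time(func: Callable[[], object], repeat: int = 5, number: int = 1000) -> float:
    """Return the fastest of several timing runs, which is the least noisy one."""
    return min(timeit.repeat(func, repeat=repeat, number=number))
```

A typical use is classify_slowdown(best_time(old_impl), best_time(new_impl)) for each hot path, where old_impl and new_impl are zero-argument callables wrapping the code under test.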

Refactoring Patterns

Apply these patterns systematically. See references/patterns.md for full catalog with examples.

Key Patterns (summary)

  • Guard Clauses - Replace nested conditionals with early returns. See references/patterns.md
  • Extract Method - Split large functions into focused units. Each extracted function starts again at nesting level zero, which makes this the most effective pattern for reducing cognitive complexity
  • Dictionary Dispatch - Eliminate if-elif chains with lookup tables
  • Match Statement (Python 3.10+) - Like a switch, a whole match statement adds +1 to cognitive complexity in total, not +1 per branch
  • Named Boolean Conditions - Extract complex boolean expressions into named variables
  • Encapsulate Global State - Move globals into classes with proper encapsulation
  • Group Related Functions - Organize scattered functions into classes by responsibility
  • Create Domain Models - Replace primitive dicts with dataclasses and enums
  • Apply Dependency Injection - Replace hard-coded dependencies with injected ones
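As an illustration of two of the patterns above, Guard Clauses and Named Boolean Conditions, here is a small before/after sketch. The order dictionary layout and the shipping rules are invented for the example.

```python
FREE_SHIPPING_MINIMUM = 50
STANDARD_SHIPPING = 5.0


def shipping_cost_nested(order):
    """Before: three levels of nesting and an unnamed compound condition."""
    if order is not None:
        if order["items"]:
            if order["country"] == "US" and order["total"] >= 50:
                return 0.0
            else:
                return 5.0
        else:
            raise ValueError("order has no items")
    else:
        raise ValueError("order is missing")


def shipping_cost(order):
    """After: guard clauses exit early, and the condition has a name."""
    if order is None:
        raise ValueError("order is missing")
    if not order["items"]:
        raise ValueError("order has no items")

    is_domestic = order["country"] == "US"
    meets_minimum = order["total"] >= FREE_SHIPPING_MINIMUM
    qualifies_for_free_shipping = is_domestic and meets_minimum
    return 0.0 if qualifies_for_free_shipping else STANDARD_SHIPPING
```

Both versions return the same values and raise the same errors, so the change is behavior-preserving while the happy path moves to the lowest indentation level.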

See references/cognitive_complexity_guide.md for cognitive complexity calculation rules and reduction patterns.
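To make one reduction pattern concrete, the following sketch shows Dictionary Dispatch replacing an if-elif chain. Under SonarSource-style rules each elif and else branch typically adds to the cognitive complexity score, while a lookup table keeps the function body flat. The shape names and formulas are invented for illustration.

```python
import math


def area_if_chain(shape: str, size: float) -> float:
    """Before: every new shape adds another branch to read and test."""
    if shape == "square":
        return size * size
    elif shape == "circle":
        return math.pi * size ** 2
    elif shape == "triangle":
        return math.sqrt(3) / 4 * size ** 2
    else:
        raise ValueError(f"unknown shape: {shape}")


# After: one entry per shape; adding a shape does not touch area().
AREA_FORMULAS = {
    "square": lambda size: size * size,
    "circle": lambda size: math.pi * size ** 2,
    "triangle": lambda size: math.sqrt(3) / 4 * size ** 2,  # equilateral
}


def area(shape: str, size: float) -> float:
    """Look up the formula for `shape` and apply it to `size`."""
    try:
        formula = AREA_FORMULAS[shape]
    except KeyError:
        raise ValueError(f"unknown shape: {shape}") from None
    return formula(size)
```

The dispatch version keeps the same error for unknown shapes, which matters when callers already depend on ValueError.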

Naming Conventions

  • Variables: Descriptive names, booleans as is_active/has_permission/can_edit, collections as plurals
  • Functions: Verb + object (calculate_total, validate_email), boolean queries as is_valid()/has_items()
  • Constants: UPPERCASE_WITH_UNDERSCORES, replace magic numbers/strings
  • Classes: PascalCase nouns (UserAccount, PaymentProcessor)

Documentation Patterns

  • Function Docstrings - Document purpose, args, returns, raises (Google style preferred)
  • Module Documentation - Purpose and key dependencies
  • Inline Comments - Only for non-obvious "why"
  • Type Hints - All public APIs and complex internals
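A short sketch of how the naming and documentation conventions combine in one function: a verb + object name, an UPPERCASE constant, type hints, a Google-style docstring, and a single comment that explains why rather than what. The function and its rounding rule are made up for the example.

```python
from typing import Sequence

CURRENCY_DECIMALS = 2


def calculate_total(prices: Sequence[float], tax_rate: float = 0.0) -> float:
    """Return the sum of prices with tax applied.

    Args:
        prices: Item prices before tax.
        tax_rate: Fractional tax rate, for example 0.2 for 20%.

    Returns:
        The taxed total, rounded to CURRENCY_DECIMALS decimal places.

    Raises:
        ValueError: If tax_rate is negative.
    """
    if tax_rate < 0:
        raise ValueError("tax_rate must not be negative")
    # Round once at the end so per-item rounding errors do not accumulate.
    return round(sum(prices) * (1 + tax_rate), CURRENCY_DECIMALS)
```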

OOP Transformation Patterns

For transforming script-like code to structured OOP. See references/examples/script_to_oop_transformation.md for a complete guide and references/oop_principles.md for SOLID principles.
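As a compact illustration of where such a transformation tends to end up, the sketch below combines a domain model (dataclass plus enum) with dependency injection. All class names here (Order, OrderStatus, PaymentGateway, OrderService, FakeGateway) are hypothetical and are not taken from the referenced guide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Protocol


class OrderStatus(Enum):
    PENDING = "pending"
    PAID = "paid"


@dataclass
class Order:
    """Domain model replacing a loose dict such as {"id": ..., "total": ...}."""
    order_id: str
    total: float
    status: OrderStatus = OrderStatus.PENDING


class PaymentGateway(Protocol):
    """Anything with a matching charge() method can be injected."""
    def charge(self, order_id: str, amount: float) -> bool: ...


class OrderService:
    """Receives its gateway instead of importing a hard-coded one."""

    def __init__(self, gateway: PaymentGateway) -> None:
        self._gateway = gateway

    def pay(self, order: Order) -> Order:
        """Charge the order and mark it paid only if the charge succeeds."""
        if self._gateway.charge(order.order_id, order.total):
            order.status = OrderStatus.PAID
        return order


class FakeGateway:
    """Test double: records charges instead of calling a real provider."""

    def __init__(self, succeed: bool = True) -> None:
        self.succeed = succeed
        self.charges: list[tuple[str, float]] = []

    def charge(self, order_id: str, amount: float) -> bool:
        self.charges.append((order_id, amount))
        return self.succeed
```

Because the gateway is injected, OrderService(FakeGateway()) can be exercised in tests without any network access or module-level state.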

Anti-Patterns to Fix

See references/anti-patterns.md for the full catalog. Priority order:

  • Critical: Script-like/procedural code with global state, God Object/God Class
  • High: Complex nested conditionals (>3 levels), long functions (>30 lines), magic numbers, cryptic names, missing type hints, missing docstrings
  • Medium: Duplicate code, primitive obsession, long parameter lists (>5)
  • Low: Inconsistent naming, redundant comments, unused imports

Tooling Recommendations

Primary Stack: Ruff + Complexipy (recommended for new projects)

pip install ruff complexipy radon wily

ruff check src/                              # Fast linting (Rust, replaces flake8+plugins)
complexipy src/ --max-complexity-allowed 15  # Cognitive complexity (Rust)
radon mi src/ -s                             # Maintainability Index

See references/cognitive_complexity_guide.md for complete configuration (pyproject.toml, pre-commit hooks, GitHub Actions, CLI usage).

Alternative: Flake8 (for projects already using it)

The scripts/analyze_with_flake8.py and scripts/compare_flake8_reports.py scripts use flake8. See references/flake8_plugins_guide.md for the curated plugin list.

Multi-Metric Analysis

Use scripts/analyze_multi_metrics.py to combine cognitive complexity (complexipy), cyclomatic complexity (radon), and maintainability index in a single report.

| Metric | Tool | Use |
|---|---|---|
| Cognitive Complexity | complexipy | Human comprehension |
| Cyclomatic Complexity | ruff (C901), radon | Test planning |
| Maintainability Index | radon | Overall code health |

Metric Targets

  • Cyclomatic complexity: <10 per function (warning at 15, error at 20)
  • Cognitive complexity: <15 per function (SonarQube default, warning at 20)
  • Function length: <30 lines (warning at 50)
  • Nesting depth: <=3 levels
  • Docstring coverage: >80% for public functions
  • Type hint coverage: >90% for public APIs
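Two of these targets, function length and nesting depth, can be approximated with the standard library alone. The sketch below is a rough stand-in for illustration, not a replacement for complexipy or radon: function_metrics and violations are hypothetical names, and because Python's AST represents elif as a nested if, long elif chains are over-counted.

```python
import ast

NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def nesting_depth(node: ast.AST, depth: int = 0) -> int:
    """Return the deepest level of nested control-flow blocks under node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, nesting_depth(child, child_depth))
    return deepest


def function_metrics(source: str) -> dict[str, dict[str, int]]:
    """Map each function name to its line count and nesting depth."""
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            metrics[node.name] = {
                "length": node.end_lineno - node.lineno + 1,
                "nesting": nesting_depth(node),
            }
    return metrics


def violations(
    metrics: dict[str, dict[str, int]], max_length: int = 30, max_nesting: int = 3
) -> list[str]:
    """Names of functions that miss the <30 lines or <=3 levels targets."""
    return sorted(
        name
        for name, m in metrics.items()
        if m["length"] >= max_length or m["nesting"] > max_nesting
    )


SAMPLE = '''
def flat(x):
    return x + 1

def deep(items):
    for item in items:
        if item:
            while item > 0:
                if item % 2:
                    item -= 1
                else:
                    item -= 2
    return items
'''
```

For the sample above, violations(function_metrics(SAMPLE)) reports only deep, whose four nested blocks exceed the three-level target.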

Historical Tracking with Wily

Monitor trends over time, not just thresholds. See references/cognitive_complexity_guide.md for setup and CI integration.

Common Refactoring Mistakes

See references/REGRESSION_PREVENTION.md for the full guide. Key traps:

  1. Incomplete Migration - Removing old code before ALL usages are migrated (causes NameErrors)
  2. Partial Pattern Application - Applying refactoring to some functions but not others
  3. Breaking Public APIs - Changing function signatures used by external code
  4. Assuming Tests Cover Everything - Tests pass but runtime errors occur (run static analysis!)
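Traps 1 and 4 often occur together. In the hypothetical sketch below, a helper was removed during refactoring but one rarely used branch still calls it. Python only resolves the name when that line runs, so tests that exercise the common path keep passing, whereas flake8's F821 check reports the undefined name without running anything.

```python
def format_price(amount: float, currency: str = "USD") -> str:
    """Format a price; only the USD path is covered by the existing tests."""
    if currency == "USD":
        return f"${amount:.2f}"
    # Incomplete migration: legacy_format was deleted, but this call remains.
    return legacy_format(amount, currency)


def rare_path_fails() -> bool:
    """Show that the untested branch raises NameError at runtime."""
    try:
        format_price(3.5, "EUR")
    except NameError:
        return True
    return False
```

Here format_price(3.5) works while rare_path_fails() returns True, which is why static analysis runs before, and not instead of, the test suite.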

Output Format

Structure refactoring output using the template from assets/templates/summary_template.md. Include:

  • Changes made with rationale and risk level
  • Before/after metrics comparison table
  • Test results and performance impact
  • Risk assessment and human review recommendation

Related tools -- when to use what

  • humanize (agent, humanize plugin) -- Multi-language cosmetic cleanup. Renames local variables, improves comments, simplifies structure. Lowest regression risk. Use for: "make this readable", "clean up naming".
  • python-refactor (this skill) -- Python-only deep restructuring. OOP transformation, SOLID principles, complexity metrics, migration checklists, benchmark validation. Use for: "refactor this module", "reduce complexity", "transform to OOP".

Escalation path: humanize -> python-refactor (from safest to most thorough).

Integration with Same-Package Skills

  • python-tdd - Set up tests before refactoring, validate coverage after
  • python-performance-optimization - Deep profiling before/after refactoring
  • python-packaging - If refactoring a library, handle pyproject.toml and distribution
  • uv-package-manager - Use uv run ruff, uv run complexipy for tool execution
  • async-python-patterns - Reference async patterns when refactoring async code

Edge Cases and Limitations

When NOT to Refactor: Performance-critical optimized code (profile first), code scheduled for deletion, external dependencies (contribute upstream), stable legacy code nobody needs to modify.

Limitations: Cannot improve algorithmic complexity (that's algorithm change, not refactoring). Cannot add domain knowledge not in code/comments. Cannot guarantee correctness without tests. Code style preferences vary - adjust based on team conventions.

Examples

See references/examples/ for before/after examples:

  • script_to_oop_transformation.md - Complete transformation from script-like code to clean OOP architecture
  • python_complexity_reduction.md - Nested conditionals and long functions
  • typescript_naming_improvements.md - Variable and function naming patterns (cross-language reference)

Success Criteria

Refactoring is successful when:

  1. ZERO regressions - All existing tests pass, behavior unchanged
  2. Golden master match - Identical output for documented critical cases
  3. Complexity metrics improved (documented in summary)
  4. No performance regression >10% (or explicit approval obtained)
  5. Documentation coverage improved
  6. Code is easier for humans to understand
  7. No new security vulnerabilities introduced
  8. Changes are atomic and well-documented in git history
  9. Wily trend - Complexity not increased compared to previous commit
  10. Static analysis shows improvement
First Seen
Feb 4, 2026