refactoring-catalog
Refactoring Catalog
Apply proven refactoring patterns to improve code structure while preserving behavior. Based on Martin Fowler's refactoring catalog with practical Python examples.
When to Use This Skill
- Reducing code complexity and improving readability
- Eliminating code smells and technical debt
- Preparing code for new features
- Improving testability of existing code
- Applying design patterns to solve structural issues
- Breaking up large classes or functions
Safe Refactoring Workflow
The Golden Rule
Never change behavior while refactoring. Never refactor while changing behavior.
Refactoring Loop
1. Ensure tests pass (baseline)
2. Make ONE small change
3. Run tests
4. If tests pass → commit
5. If tests fail → revert and try smaller step
6. Repeat
Before Starting
# Ensure clean state
git status # Should be clean
# Run tests to establish baseline
uv run pytest -v
# Check coverage for area being refactored
uv run pytest --cov=src/module --cov-report=term-missing
After Each Change
# Verify behavior preserved
uv run pytest -v
# Check types still valid
uv run mypy src/
# Ensure style compliance
uv run ruff check src/
Code Smells Reference
Method-Level Smells
| Smell | Signs | Refactoring |
|---|---|---|
| Long Method | > 20 lines, multiple levels of abstraction | Extract Method |
| Long Parameter List | > 3 parameters | Introduce Parameter Object |
| Duplicated Code | Same code in multiple places | Extract Method, Pull Up Method |
| Complex Conditional | Nested if/else, switch statements | Replace Conditional with Polymorphism |
| Feature Envy | Method uses another class's data extensively | Move Method |
| Data Clumps | Same group of data appears together | Extract Class |
Class-Level Smells
| Smell | Signs | Refactoring |
|---|---|---|
| Large Class | > 200 lines, many responsibilities | Extract Class |
| God Class | Knows/does too much | Extract Class, Move Method |
| Data Class | Only getters/setters, no behavior | Move behavior into class |
| Refused Bequest | Subclass doesn't use inherited methods | Replace Inheritance with Delegation |
| Parallel Inheritance | Every subclass requires another subclass | Collapse Hierarchy |
Architecture Smells
| Smell | Signs | Refactoring |
|---|---|---|
| Shotgun Surgery | One change requires editing many classes | Move Method, Inline Class |
| Divergent Change | One class changed for different reasons | Extract Class |
| Inappropriate Intimacy | Classes know too much about each other | Move Method, Hide Delegate |
Refactoring Patterns
Extract Method
When: Long method, duplicated code, code needs a comment to explain
Before:
def process_order(order: Order) -> None:
# Validate order
if not order.items:
raise ValueError("Order must have items")
if order.total < 0:
raise ValueError("Order total cannot be negative")
for item in order.items:
if item.quantity <= 0:
raise ValueError(f"Invalid quantity for {item.name}")
# Calculate discount
discount = 0.0
if order.customer.is_premium:
discount = order.total * 0.1
if order.total > 1000:
discount += order.total * 0.05
# Process payment
final_total = order.total - discount
payment = Payment(amount=final_total)
payment.process()
# Send confirmation
email = Email(to=order.customer.email)
email.subject = "Order Confirmation"
email.body = f"Your order #{order.id} has been processed."
email.send()
After:
def process_order(order: Order) -> None:
validate_order(order)
discount = calculate_discount(order)
process_payment(order, discount)
send_confirmation(order)
def validate_order(order: Order) -> None:
"""Validate order has items with valid quantities."""
if not order.items:
raise ValueError("Order must have items")
if order.total < 0:
raise ValueError("Order total cannot be negative")
for item in order.items:
if item.quantity <= 0:
raise ValueError(f"Invalid quantity for {item.name}")
def calculate_discount(order: Order) -> float:
"""Calculate total discount based on customer and order value."""
discount = 0.0
if order.customer.is_premium:
discount = order.total * 0.1
if order.total > 1000:
discount += order.total * 0.05
return discount
def process_payment(order: Order, discount: float) -> None:
"""Process payment for the order minus discount."""
final_total = order.total - discount
payment = Payment(amount=final_total)
payment.process()
def send_confirmation(order: Order) -> None:
"""Send order confirmation email to customer."""
email = Email(to=order.customer.email)
email.subject = "Order Confirmation"
email.body = f"Your order #{order.id} has been processed."
email.send()
Introduce Parameter Object
When: Multiple functions take the same group of parameters
Before:
def deploy_module(
name: str,
version: str,
environment: str,
region: str,
timeout: int,
retries: int,
) -> DeployResult:
...
def validate_deployment(
name: str,
version: str,
environment: str,
region: str,
) -> bool:
...
def rollback_deployment(
name: str,
version: str,
environment: str,
region: str,
) -> None:
...
After:
@dataclass(frozen=True)
class DeploymentConfig:
"""Configuration for a deployment operation."""
name: str
version: str
environment: str
region: str
timeout: int = 300
retries: int = 3
def deploy_module(config: DeploymentConfig) -> DeployResult:
...
def validate_deployment(config: DeploymentConfig) -> bool:
...
def rollback_deployment(config: DeploymentConfig) -> None:
...
Replace Conditional with Polymorphism
When: Switch/if-else on type codes, behavior varies by type
Before:
class Employee:
def __init__(self, type_code: str, base_salary: float):
self.type_code = type_code
self.base_salary = base_salary
def calculate_pay(self) -> float:
if self.type_code == "engineer":
return self.base_salary * 1.2
elif self.type_code == "manager":
return self.base_salary * 1.5 + 1000
elif self.type_code == "intern":
return self.base_salary * 0.5
else:
return self.base_salary
def get_title(self) -> str:
if self.type_code == "engineer":
return "Software Engineer"
elif self.type_code == "manager":
return "Engineering Manager"
elif self.type_code == "intern":
return "Intern"
else:
return "Employee"
After:
from abc import ABC, abstractmethod
class Employee(ABC):
"""Base employee with common behavior."""
def __init__(self, base_salary: float):
self.base_salary = base_salary
@abstractmethod
def calculate_pay(self) -> float:
"""Calculate total pay for this employee type."""
...
@abstractmethod
def get_title(self) -> str:
"""Get the job title for this employee type."""
...
class Engineer(Employee):
def calculate_pay(self) -> float:
return self.base_salary * 1.2
def get_title(self) -> str:
return "Software Engineer"
class Manager(Employee):
def calculate_pay(self) -> float:
return self.base_salary * 1.5 + 1000
def get_title(self) -> str:
return "Engineering Manager"
class Intern(Employee):
def calculate_pay(self) -> float:
return self.base_salary * 0.5
def get_title(self) -> str:
return "Intern"
Extract Class
When: Class has too many responsibilities, groups of data belong together
Before:
class Order:
def __init__(self):
self.items: list[Item] = []
# Customer data mixed in
self.customer_name: str = ""
self.customer_email: str = ""
self.customer_phone: str = ""
self.customer_address: str = ""
self.customer_is_premium: bool = False
# Shipping data mixed in
self.shipping_method: str = ""
self.shipping_address: str = ""
self.shipping_cost: float = 0.0
self.estimated_delivery: datetime | None = None
def get_customer_contact(self) -> str:
return f"{self.customer_email} / {self.customer_phone}"
def calculate_shipping(self) -> float:
if self.shipping_method == "express":
return 25.0
return 10.0
After:
@dataclass
class Customer:
"""Customer information."""
name: str
email: str
phone: str
address: str
is_premium: bool = False
def get_contact(self) -> str:
return f"{self.email} / {self.phone}"
@dataclass
class ShippingInfo:
"""Shipping details for an order."""
method: str
address: str
cost: float = 0.0
estimated_delivery: datetime | None = None
def calculate_cost(self) -> float:
if self.method == "express":
return 25.0
return 10.0
@dataclass
class Order:
"""Order with separated concerns."""
items: list[Item]
customer: Customer
shipping: ShippingInfo
Replace Magic Numbers with Constants
When: Numeric literals appear without explanation
Before:
def calculate_price(base: float, quantity: int) -> float:
if quantity > 100:
return base * quantity * 0.85
elif quantity > 50:
return base * quantity * 0.9
elif quantity > 10:
return base * quantity * 0.95
return base * quantity
def is_valid_password(password: str) -> bool:
return 8 <= len(password) <= 128
After:
# Discount thresholds and rates
BULK_DISCOUNT_THRESHOLD = 100
LARGE_ORDER_THRESHOLD = 50
SMALL_ORDER_THRESHOLD = 10
BULK_DISCOUNT_RATE = 0.85
LARGE_ORDER_DISCOUNT_RATE = 0.90
SMALL_ORDER_DISCOUNT_RATE = 0.95
# Password constraints
MIN_PASSWORD_LENGTH = 8
MAX_PASSWORD_LENGTH = 128
def calculate_price(base: float, quantity: int) -> float:
if quantity > BULK_DISCOUNT_THRESHOLD:
return base * quantity * BULK_DISCOUNT_RATE
elif quantity > LARGE_ORDER_THRESHOLD:
return base * quantity * LARGE_ORDER_DISCOUNT_RATE
elif quantity > SMALL_ORDER_THRESHOLD:
return base * quantity * SMALL_ORDER_DISCOUNT_RATE
return base * quantity
def is_valid_password(password: str) -> bool:
return MIN_PASSWORD_LENGTH <= len(password) <= MAX_PASSWORD_LENGTH
Replace Nested Conditionals with Guard Clauses
When: Deeply nested if/else structures, main logic buried
Before:
def process_payment(order: Order, payment: Payment) -> Result:
if order is not None:
if order.is_valid():
if payment is not None:
if payment.amount >= order.total:
if not order.is_cancelled:
# Main logic buried deep
payment.process()
order.mark_paid()
return Result.success()
else:
return Result.error("Order is cancelled")
else:
return Result.error("Insufficient payment")
else:
return Result.error("Payment is required")
else:
return Result.error("Invalid order")
else:
return Result.error("Order is required")
After:
def process_payment(order: Order | None, payment: Payment | None) -> Result:
# Guard clauses - fail fast
if order is None:
return Result.error("Order is required")
if not order.is_valid():
return Result.error("Invalid order")
if order.is_cancelled:
return Result.error("Order is cancelled")
if payment is None:
return Result.error("Payment is required")
if payment.amount < order.total:
return Result.error("Insufficient payment")
# Main logic - clear and unindented
payment.process()
order.mark_paid()
return Result.success()
Introduce Explaining Variable
When: Complex expressions that are hard to understand
Before:
def calculate_shipping(order: Order) -> float:
return (
(order.total * 0.05 if order.total < 100 else order.total * 0.03)
+ (0 if order.customer.is_premium else 5.0)
+ (10.0 if order.destination.country != "US" else 0)
+ (15.0 if sum(item.weight for item in order.items) > 50 else 0)
)
After:
def calculate_shipping(order: Order) -> float:
# Base shipping as percentage of order
base_rate = 0.03 if order.total >= 100 else 0.05
base_shipping = order.total * base_rate
# Premium customers get free handling
handling_fee = 0.0 if order.customer.is_premium else 5.0
# International shipping surcharge
is_international = order.destination.country != "US"
international_fee = 10.0 if is_international else 0.0
# Heavy package surcharge
total_weight = sum(item.weight for item in order.items)
is_heavy = total_weight > 50
weight_fee = 15.0 if is_heavy else 0.0
return base_shipping + handling_fee + international_fee + weight_fee
Replace Constructor with Factory Method
When: Complex object creation logic, need different creation modes
Before:
class Connection:
def __init__(
self,
host: str,
port: int,
use_ssl: bool,
ssl_cert: str | None,
timeout: int,
pool_size: int,
):
self.host = host
self.port = port
self.use_ssl = use_ssl
self.ssl_cert = ssl_cert
self.timeout = timeout
self.pool_size = pool_size
# Complex initialization...
# Usage is complex
conn = Connection("localhost", 5432, False, None, 30, 10)
prod_conn = Connection("db.prod.com", 5432, True, "/path/cert.pem", 60, 50)
After:
class Connection:
def __init__(
self,
host: str,
port: int,
use_ssl: bool = False,
ssl_cert: str | None = None,
timeout: int = 30,
pool_size: int = 10,
):
self.host = host
self.port = port
self.use_ssl = use_ssl
self.ssl_cert = ssl_cert
self.timeout = timeout
self.pool_size = pool_size
@classmethod
def create_local(cls, port: int = 5432) -> "Connection":
"""Create connection for local development."""
return cls(host="localhost", port=port)
@classmethod
def create_production(cls, host: str, ssl_cert: str) -> "Connection":
"""Create secure connection for production."""
return cls(
host=host,
port=5432,
use_ssl=True,
ssl_cert=ssl_cert,
timeout=60,
pool_size=50,
)
@classmethod
def from_url(cls, url: str) -> "Connection":
"""Parse connection from URL string."""
parsed = urlparse(url)
return cls(
host=parsed.hostname or "localhost",
port=parsed.port or 5432,
use_ssl=parsed.scheme == "https",
)
# Usage is clear
conn = Connection.create_local()
prod_conn = Connection.create_production("db.prod.com", "/path/cert.pem")
Complexity Metrics
Cyclomatic Complexity
Measures the number of linearly independent paths through code.
| Score | Meaning | Action |
|---|---|---|
| 1-5 | Low complexity | Good |
| 6-10 | Moderate | Consider simplifying |
| 11-20 | High | Should refactor |
| 21+ | Very high | Must refactor |
How to Reduce Complexity
- Extract Method - Break into smaller functions
- Replace Conditional with Polymorphism - Remove switch statements
- Guard Clauses - Reduce nesting
- Introduce Null Object - Remove null checks
Resources
Python-Specific Refactorings
Use Dataclasses
# Before
class Point:
def __init__(self, x: float, y: float) -> None:
self.x = x
self.y = y
def __eq__(self, other: object) -> bool:
if not isinstance(other, Point):
return False
return self.x == other.x and self.y == other.y
# After
from dataclasses import dataclass
@dataclass
class Point:
x: float
y: float
Use Pydantic for Validation
# Before
def create_module(data: dict) -> Module:
if "name" not in data:
raise ValueError("name required")
if not isinstance(data.get("version"), str):
raise ValueError("version must be string")
return Module(name=data["name"], version=data["version"])
# After
from pydantic import BaseModel, Field
class ModuleConfig(BaseModel):
name: str = Field(..., min_length=1)
version: str = Field(..., pattern=r"^\d+\.\d+\.\d+$")
Context Managers for Resources
# Before
def process_file(path: str) -> str:
f = open(path, "r")
try:
content = f.read()
return process(content)
finally:
f.close()
# After
def process_file(path: Path) -> str:
with path.open("r") as f:
return process(f.read())
Safety Checklist
Before committing any refactoring:
- All tests pass (
uv run pytest) - Type checking passes (
uv run mypy src/) - Linting passes (
uv run ruff check src/) - No behavior changes introduced
- Changes are incremental and reversible
- Documentation updated if APIs changed
- Performance not degraded
Progress Tracking
After completing refactoring, report improvements:
Refactoring Complete:
- Methods refactored: X
- Complexity reduction: Y%
- Duplication eliminated: Z%
- Test coverage: N%
- All tests passing: ✓
- Type checking: ✓
- Linting: ✓
Guidelines
- Always have tests before refactoring
- Make one change at a time
- Commit after each successful refactoring
- Run tests after every change
- If tests fail, revert immediately
- Never refactor and add features in the same commit
- Document significant structural changes
- Use dataclasses to replace manual
__init__+__eq__ - Use Pydantic to replace manual validation logic
- Use context managers for resource management