Code Smell Detector

Overview

Identify code quality and design smells in Python codebases, then provide specific refactoring recommendations to improve maintainability and design.

Workflow

1. Understand the Analysis Scope

Define what to analyze:

Questions to ask:

What directory or files should be analyzed?
Focus on quality smells, design smells, or both?
Are there specific concerns (e.g., "this class is too complex")?
Should test files be included?

Determine scope:

# Check project structure
ls -la

# Count Python files
find . -name "*.py" | wc -l

# Identify large files (potential smells)
find . -name "*.py" -exec wc -l {} + | sort -rn | head -10

2. Detect Code Smells

Use multiple detection strategies.

Strategy 1: Automated Detection

Use the bundled script for AST-based analysis:

# Scan entire project
python scripts/detect_smells.py /path/to/project

# Exclude specific directories
python scripts/detect_smells.py /path/to/project venv,tests,docs

What it detects:

Long methods (>50 lines)
Too many parameters (>5)
Large classes (>15 methods)
God classes (>20 methods)
Magic numbers

Strategy 2: Manual Code Review

Read the code to identify design smells. See smell-patterns.md for comprehensive catalog.

Look for:

Code Quality Smells:

Duplicate code blocks
Magic numbers (unexplained numeric literals)
Hardcoded values (paths, URLs, config)
Commented-out code
Inconsistent naming

Design Smells:

God classes (too many responsibilities)
Feature envy (method uses more from another class)
Inappropriate intimacy (classes too coupled)
Data clumps (same parameters repeated)
Primitive obsession (using primitives instead of objects)
Long parameter lists (>5 parameters)

Search patterns:

# Find long files (potential large classes)
find . -name "*.py" -exec wc -l {} + | awk '$1 > 300'

# Find magic numbers (basic pattern)
grep -r "[^0-9]\d\{3,\}" --include="*.py" .

# Find hardcoded paths
grep -r '"/.*/"' --include="*.py" .

# Find commented code
grep -r "^[ ]*#.*def \|^[ ]*#.*class " --include="*.py" .

Strategy 3: Use External Tools

radon - Complexity metrics:

# Install
pip install radon

# Check cyclomatic complexity
radon cc /path/to/project -a

# Maintainability index
radon mi /path/to/project

# Show only complex functions
radon cc /path/to/project -nc

pylint - Code quality:

pip install pylint

# Check for code smells
pylint /path/to/project --disable=C0111  # Disable docstring warnings

3. Categorize Smells

Organize findings by severity and type.

See smell-patterns.md for detailed patterns.

High Severity

Immediate attention needed:

God classes (>20 methods, multiple responsibilities)
Shotgun surgery (changes ripple across many files)
Feature envy (method belongs in different class)
Long parameter lists (>7 parameters)

Medium Severity

Should refactor soon:

Large classes (>15 methods)
Duplicate code
Data clumps
Primitive obsession
Inappropriate intimacy

Low Severity

Nice to improve:

Magic numbers
Hardcoded values
Inconsistent naming
Lazy classes
Commented-out code

4. Identify Refactorings

For each smell, determine appropriate refactoring.

See refactoring-patterns.md for detailed examples.

Common mappings:

Smell	Refactoring
God class	Extract Class, Extract Service
Feature envy	Move Method
Duplicate code	Extract Method, Pull Up Method
Data clumps	Introduce Parameter Object
Magic numbers	Replace with Symbolic Constant
Long parameter list	Introduce Parameter Object
Primitive obsession	Replace Data Value with Object
Inappropriate intimacy	Move Method, Hide Delegate
Shotgun surgery	Move Method, Inline Class
Lazy class	Inline Class, Collapse Hierarchy

5. Generate Report

Create a structured markdown report with specific refactoring recommendations.

Code Smell Analysis Report

Project: [Project Name] Analyzed: [Date] Scope: [Directories analyzed] Excluded: [Excluded directories]

Summary

High severity smells: X issues
Medium severity smells: Y issues
Low severity smells: Z issues

Total: N code smells detected

🔴 High Severity Smells

Smell 1: God Class

Location: src/services/user_manager.py:15

Class: UserManager

Description: Class has 28 methods handling multiple unrelated responsibilities (user CRUD, authentication, email, logging, analytics).

Impact:

Violates Single Responsibility Principle
Hard to test and maintain
Changes ripple across unrelated features

Refactoring: Extract Class

Recommendation:

Split into focused classes by responsibility:

# Before: God class with 28 methods
class UserManager:
    def create_user(self, data): pass
    def update_user(self, user_id, data): pass
    def delete_user(self, user_id): pass
    def authenticate(self, username, password): pass
    def hash_password(self, password): pass
    def send_welcome_email(self, user): pass
    def send_password_reset(self, user): pass
    def log_activity(self, user, action): pass
    def get_statistics(self, user): pass
    # ... 19 more methods

# After: Split by responsibility
class UserRepository:
    """Handles user persistence."""
    def create(self, data): pass
    def update(self, user_id, data): pass
    def delete(self, user_id): pass
    def find_by_id(self, user_id): pass

class UserAuthService:
    """Handles authentication."""
    def authenticate(self, username, password): pass
    def hash_password(self, password): pass
    def validate_password_strength(self, password): pass

class UserNotificationService:
    """Handles user notifications."""
    def send_welcome_email(self, user): pass
    def send_password_reset(self, user): pass

class UserAnalyticsService:
    """Handles user analytics."""
    def log_activity(self, user, action): pass
    def get_statistics(self, user): pass

Priority: High - Refactor within 1-2 sprints

Smell 2: Feature Envy

Location: src/models/order.py:45

Method: Order.calculate_total()

Description: Method uses customer features heavily (discount_rate, is_premium, shipping_address) rather than its own class features.

Impact:

Poor cohesion - method is in wrong class
Changes to Customer affect Order
Violates "tell, don't ask" principle

Refactoring: Move Method

Recommendation:

Move calculation logic to Customer class:

# Before: Feature envy
class Order:
    def calculate_total(self):
        discount = self.customer.get_discount_rate()
        is_premium = self.customer.is_premium_member()
        address = self.customer.get_shipping_address()

        total = sum(item.price for item in self.items)
        if is_premium:
            total *= (1 - discount)
        if address.country != 'US':
            total += 50
        return total

# After: Move to appropriate class
class Customer:
    def calculate_order_total(self, items):
        total = sum(item.price for item in items)
        if self.is_premium_member():
            total *= (1 - self.get_discount_rate())
        if self.shipping_address.country != 'US':
            total += 50
        return total

class Order:
    def calculate_total(self):
        return self.customer.calculate_order_total(self.items)

Priority: High - Refactor within 2 weeks

🟡 Medium Severity Smells

Smell 3: Duplicate Code

Locations:

src/validators/user_validator.py:23-35
src/validators/profile_validator.py:45-57

Description: Identical email and name validation logic appears in two validators.

Impact:

Changes must be made in multiple places
Risk of inconsistent validation
Maintenance burden

Refactoring: Extract Method

Recommendation:

Extract common validation into shared utility:

# Before: Duplicate code
class UserValidator:
    def validate(self, data):
        if not data.get('email') or '@' not in data['email']:
            raise ValueError("Invalid email")
        if not data.get('name') or len(data['name']) < 2:
            raise ValueError("Invalid name")
        # ... more validation

class ProfileValidator:
    def validate(self, data):
        if not data.get('email') or '@' not in data['email']:
            raise ValueError("Invalid email")
        if not data.get('name') or len(data['name']) < 2:
            raise ValueError("Invalid name")
        # ... more validation

# After: Extract common validation
class ValidationHelpers:
    @staticmethod
    def validate_email(email):
        if not email or '@' not in email:
            raise ValueError("Invalid email")

    @staticmethod
    def validate_name(name):
        if not name or len(name) < 2:
            raise ValueError("Invalid name")

class UserValidator:
    def validate(self, data):
        ValidationHelpers.validate_email(data.get('email'))
        ValidationHelpers.validate_name(data.get('name'))
        # ... more validation

class ProfileValidator:
    def validate(self, data):
        ValidationHelpers.validate_email(data.get('email'))
        ValidationHelpers.validate_name(data.get('name'))
        # ... more validation

Priority: Medium - Refactor within month

Smell 4: Data Clumps

Locations:

src/services/email_service.py:12 (6 parameters)
src/services/sms_service.py:23 (6 parameters)
src/services/notification_service.py:34 (6 parameters)

Description: Same parameter group (name, email, phone, address_street, address_city, address_zip) appears in multiple methods.

Impact:

Suggests missing abstraction
Hard to maintain - changes affect many signatures
Easy to pass wrong parameters

Refactoring: Introduce Parameter Object

Recommendation:

Create ContactInfo value object:

# Before: Data clumps
def send_email(name, email, phone, street, city, zip_code):
    pass

def send_sms(name, email, phone, street, city, zip_code):
    pass

def send_notification(name, email, phone, street, city, zip_code):
    pass

# After: Parameter object
from dataclasses import dataclass

@dataclass
class Address:
    street: str
    city: str
    zip_code: str

@dataclass
class ContactInfo:
    name: str
    email: str
    phone: str
    address: Address

def send_email(contact: ContactInfo):
    pass

def send_sms(contact: ContactInfo):
    pass

def send_notification(contact: ContactInfo):
    pass

Priority: Medium - Refactor within month

🔵 Low Severity Smells

Smell 5: Magic Numbers

Location: src/billing/calculator.py:67-78

Description: Multiple unexplained numeric literals (1000, 0.15, 500, 0.10, 0.05).

Impact:

Unclear business rules
Hard to change discount thresholds
Risk of typos

Refactoring: Replace Magic Number with Symbolic Constant

Recommendation:

Use named constants:

# Before: Magic numbers
def calculate_discount(price):
    if price > 1000:
        return price * 0.15
    elif price > 500:
        return price * 0.10
    else:
        return price * 0.05

# After: Named constants
BULK_ORDER_THRESHOLD = 1000
LARGE_ORDER_THRESHOLD = 500
BULK_DISCOUNT_RATE = 0.15
LARGE_DISCOUNT_RATE = 0.10
STANDARD_DISCOUNT_RATE = 0.05

def calculate_discount(price):
    if price > BULK_ORDER_THRESHOLD:
        return price * BULK_DISCOUNT_RATE
    elif price > LARGE_ORDER_THRESHOLD:
        return price * LARGE_DISCOUNT_RATE
    else:
        return price * STANDARD_DISCOUNT_RATE

Priority: Low - Refactor when touching this code

Recommendations

Immediate Actions (High Priority)

Refactor UserManager god class into separate services
Move Order.calculate_total() to Customer class
Address feature envy in payment processing

Short-term Actions (Medium Priority)

Extract duplicate validation logic
Introduce ContactInfo parameter object
Refactor large classes (>15 methods)

Long-term Actions (Low Priority)

Replace magic numbers with named constants
Standardize naming conventions
Remove commented-out code

Prevention Strategies

Add to CI/CD:

# Complexity checks
radon cc src/ -nc  # Fail on complex functions

# Code quality
pylint src/ --fail-under=8.0

Code review checklist:

Regular audits:

Weekly: Review new code for smells
Monthly: Full codebase smell detection
Quarterly: Major refactoring sprints

6. Present Findings

Share report with team and discuss refactoring priorities.

Presentation tips:

Start with summary statistics
Focus on high-severity smells first
Show before/after code examples
Discuss business impact
Prioritize based on team capacity

Get team input:

"Does this class actually have too many responsibilities?"
"Is this refactoring worth the effort?"
"Should we tackle this now or later?"

Tips for Effective Smell Detection

Start with obvious smells:

Focus on metrics first (long methods, large classes)
Look for duplicate code
Check for magic numbers and hardcoded values

Consider context:

Some "smells" may be justified
Legacy code may have historical reasons
Performance-critical code may break rules

Prioritize by impact:

High impact: God classes, shotgun surgery
Medium impact: Duplicate code, data clumps
Low impact: Magic numbers, naming issues

Refactor incrementally:

Don't refactor everything at once
Focus on areas changing frequently
Test after each refactoring step

Use tools as guides:

Tools find potential issues
Human judgment determines if it's truly a smell
Team decides refactoring priorities

Common False Positives

Not every pattern is a smell:

Long methods may be acceptable:

Sequential data processing pipelines
Complex algorithms better kept together
Generated code

Large classes may be justified:

Façade pattern intentionally centralizes
DTO/Entity classes with many fields
Framework base classes

Magic numbers may be OK:

Well-known constants (0, 1, 100, 1024)
Index operations
Test data

Consider trade-offs:

Refactoring has costs (time, risk, testing)
Some smells aren't worth fixing
Balance purity with pragmatism

Reference

For comprehensive smell patterns and refactoring techniques:

smell-patterns.md - Detailed catalog of code smells
refactoring-patterns.md - Refactoring techniques with examples

code-smell-detector

Code Smell Detector

Overview

Workflow

1. Understand the Analysis Scope

2. Detect Code Smells

Strategy 1: Automated Detection

Strategy 2: Manual Code Review

Strategy 3: Use External Tools

3. Categorize Smells

High Severity

Medium Severity

Low Severity

4. Identify Refactorings

5. Generate Report

Code Smell Analysis Report

Summary

🔴 High Severity Smells

Smell 1: God Class

Smell 2: Feature Envy

🟡 Medium Severity Smells

Smell 3: Duplicate Code

Smell 4: Data Clumps

🔵 Low Severity Smells

Smell 5: Magic Numbers

Recommendations

Immediate Actions (High Priority)

Short-term Actions (Medium Priority)

Long-term Actions (Low Priority)

Prevention Strategies

6. Present Findings

Tips for Effective Smell Detection

Common False Positives

Reference