# Test-Guided Bug Detector

Analyze failing tests to detect and explain functional bugs in code.

## Overview

When tests fail, they provide valuable clues about bugs in the code. This skill analyzes:

1. **Test failure output** - Error messages, stack traces, assertion failures
2. **Test expectations** - What the test expects to happen
3. **Actual behavior** - What actually happened
4. **Code execution path** - Which code was executed
5. **Suspicious patterns** - Common bug patterns that match the failure

The goal is to identify the root cause bug and explain why the test exposes it.

## Bug Detection Workflow

```
Failing test output
  → Parse failure information
  → Identify test expectations
  → Trace execution path
  → Analyze discrepancy
  → Identify suspicious code
  → Explain bug mechanism
  → Suggest fix
```

## Analysis Process

### Step 1: Parse Test Failure

Extract key information from the test output.

**What to extract:**

- Test name and location
- Failure type (assertion, exception, timeout, etc.)
- Expected vs. actual values
- Stack trace
- Error messages

Example:

```
FAILED tests/test_calculator.py::test_divide - AssertionError: assert 0 == 5
Expected: 5
Actual: 0

Stack trace:
  File "tests/test_calculator.py", line 15, in test_divide
    assert divide(10, 2) == 5
  File "src/calculator.py", line 8, in divide
    return result
```
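As a sketch of this step, expected and actual values can be pulled out of a pytest-style failure line with a regular expression. The pattern below targets pytest's default short summary for simple `==` assertions; the regex and the `parse_failure` helper are illustrative, not part of any pytest API:

```python
import re

# Hypothetical helper: the regex matches lines shaped like
# "FAILED <file>::<test> - AssertionError: assert <actual> == <expected>"
FAILURE_RE = re.compile(
    r"FAILED (?P<file>[\w/.]+)::(?P<test>\w+) - "
    r"AssertionError: assert (?P<actual>\S+) == (?P<expected>\S+)"
)

def parse_failure(line: str):
    """Return the extracted fields as a dict, or None if the line doesn't match."""
    m = FAILURE_RE.match(line)
    return m.groupdict() if m else None

info = parse_failure(
    "FAILED tests/test_calculator.py::test_divide - AssertionError: assert 0 == 5"
)
print(info)
# {'file': 'tests/test_calculator.py', 'test': 'test_divide',
#  'actual': '0', 'expected': '5'}
```

Real pytest output has more shapes than this (exceptions, multi-line diffs), so a production parser would need more patterns; the point is only that the fields listed above are mechanically extractable.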

### Step 2: Understand Test Intent

Determine what the test is trying to verify.

**Questions to answer:**

- What functionality is being tested?
- What are the inputs?
- What is the expected output?
- What properties should hold?

Example:

```python
def test_divide():
    # Intent: verify division returns the correct result
    result = divide(10, 2)
    assert result == 5  # Expects 10 / 2 = 5
```

### Step 3: Trace Execution Path

Follow the code path from the test to the failure.

**Trace elements:**

- Function calls in the stack trace
- Control flow decisions
- Data transformations
- Return values

Example trace:

```
test_divide()
  → divide(10, 2)
    → result = 0
    → return result  # returns 0; the division never happens
  → assert 0 == 5  ✗ Fails
```
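When the failure comes from an exception rather than a mismatched value, Python's standard `traceback` module can recover the call path programmatically. A minimal sketch (the `divide`/`run_test` functions here are illustrative stand-ins):

```python
import traceback

def failing_frames(exc: BaseException):
    """Return (function name, line number) for each traceback frame,
    outermost call first, raising frame last."""
    return [(f.name, f.lineno) for f in traceback.extract_tb(exc.__traceback__)]

def divide(a, b):
    return a // b  # raises ZeroDivisionError when b == 0

def run_test():
    assert divide(10, 0) == 5

try:
    run_test()
except ZeroDivisionError as exc:
    frames = failing_frames(exc)
    for name, lineno in frames:
        print(f"line {lineno} in {name}")  # last frame is divide, where the error arose
```

This gives the same call chain a stack trace shows, but as data you can analyze.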

### Step 4: Identify Discrepancy

Find where the expected and actual behavior diverge.

**Common discrepancies:**

- Wrong operator (`//` vs `/`)
- Off-by-one errors
- Null/None handling
- Type mismatches
- Logic errors

Example:

```python
# Expected: divide(10, 2) -> 5
# Actual:   divide(10, 2) -> 0
# Discrepancy: the function returns its untouched placeholder value,
# never the result of the division
```
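The first discrepancy in the list, `//` vs `/`, deserves a concrete look, because the two operators agree on some inputs and silently diverge on others:

```python
print(10 / 4)    # 2.5: true division always produces a float
print(10 // 4)   # 2: floor division discards the fractional part
print(-7 // 2)   # -4: floors toward negative infinity, not toward zero
print(10 / 2 == 10 // 2)  # True: the operators agree here, so the bug hides
```

A test suite that only exercises evenly divisible, non-negative inputs will never distinguish the two.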

### Step 5: Analyze Suspicious Code

Examine the code for known bug patterns.

**Bug patterns to check:**

- Uninitialized variables
- Wrong operators
- Missing return statements
- Incorrect conditions
- Edge case handling

Example analysis:

```python
def divide(a, b):
    result = 0     # BUG: placeholder initialized but never updated
    return result  # always returns 0; should return a // b
```

### Step 6: Explain Bug Mechanism

Describe how the bug causes the failure.

**Explanation structure:**

1. What the code does
2. What it should do
3. Why there's a mismatch
4. How the test exposes it

### Step 7: Suggest Fix

Propose a concrete fix with an explanation.

**Fix components:**

- Code change
- Why it fixes the bug
- How to verify the fix

## Common Bug Patterns

For detailed bug patterns and detection strategies, see [references/bug_patterns.md](references/bug_patterns.md).

Categories include:

- Logic errors (wrong operators, conditions)
- State management (uninitialized variables, stale state)
- Boundary conditions (off-by-one, edge cases)
- Type errors (implicit conversions, null handling)
- Concurrency bugs (race conditions, deadlocks)

## Failure Type Analysis

For analyzing different types of test failures, see [references/failure_types.md](references/failure_types.md).

Failure types:

- Assertion failures
- Exceptions and errors
- Timeouts
- Unexpected behavior
- Flaky tests

## Example Analysis

**Input:** failing test

```python
# Test file: tests/test_list_utils.py
def test_remove_duplicates():
    input_list = [1, 2, 2, 3, 3, 3, 4]
    result = remove_duplicates(input_list)
    assert result == [1, 2, 3, 4]
    assert input_list == [1, 2, 2, 3, 3, 3, 4]  # Original unchanged

# Test output:
# FAILED - AssertionError: assert [1, 2, 3, 4] == [1, 2, 2, 3, 3, 3, 4]
# The second assertion failed!

# Implementation: src/list_utils.py
def remove_duplicates(lst):
    seen = set()
    i = 0
    while i < len(lst):
        if lst[i] in seen:
            lst.pop(i)  # BUG: Modifies input list!
        else:
            seen.add(lst[i])
            i += 1
    return lst
```

**Output:** bug analysis report

# Bug Analysis Report

## Test Failure Summary

**Test:** test_remove_duplicates
**Location:** tests/test_list_utils.py:2
**Failure Type:** Assertion failure
**Failed Assertion:** `assert input_list == [1, 2, 2, 3, 3, 3, 4]`

## Expected vs Actual

**Expected:** Original list unchanged: `[1, 2, 2, 3, 3, 3, 4]`
**Actual:** Original list modified: `[1, 2, 3, 4]`

## Root Cause

**Bug Location:** src/list_utils.py:7
**Bug Type:** Unintended side effect (input mutation)

**Problematic Code:**
```python
lst.pop(i)  # Modifies the input list directly
```

## Bug Mechanism

  1. What happens: The function modifies the input list in-place using lst.pop(i)
  2. Why it's wrong: The test expects the original list to remain unchanged
  3. How test exposes it: Second assertion checks that input_list is unmodified
  4. Why it fails: Since Python passes lists by reference, modifications to lst affect the original input_list

## Execution Trace

```
test_remove_duplicates()
  input_list = [1, 2, 2, 3, 3, 3, 4]
  remove_duplicates(input_list)  # lst points to same list as input_list
    i=0: lst[0]=1, not in seen, add to seen, i=1
    i=1: lst[1]=2, not in seen, add to seen, i=2
    i=2: lst[2]=2, in seen, lst.pop(2)  # Removes from input_list!
    # Now lst = input_list = [1, 2, 3, 3, 3, 4]
    i=2: lst[2]=3, not in seen, add to seen, i=3
    i=3: lst[3]=3, in seen, lst.pop(3)  # Removes from input_list!
    # Now lst = input_list = [1, 2, 3, 3, 4]
    i=3: lst[3]=3, in seen, lst.pop(3)  # Removes from input_list!
    # Now lst = input_list = [1, 2, 3, 4]
    i=3: lst[3]=4, not in seen, add to seen, i=4
    return lst  # Returns [1, 2, 3, 4]
  result = [1, 2, 3, 4]  ✓ First assertion passes
  input_list = [1, 2, 3, 4]  ✗ Second assertion fails!
```

## Suspicious Code Regions

**Primary suspect:** src/list_utils.py:7

```python
lst.pop(i)  # Direct mutation of input
```

**Suspicion level:** HIGH
**Reason:** Modifies the input list, violating the immutability expectation

**Secondary suspect:** src/list_utils.py:11

```python
return lst  # Returns reference to modified input
```

**Suspicion level:** MEDIUM
**Reason:** Returns the same object as the input, not a new list

## Recommended Fix

**Option 1: Build a new list (recommended)**

```python
def remove_duplicates(lst):
    result = []  # Build a new list instead of mutating the input
    seen = set()
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```

Why this fixes it:

- Creates a new list instead of modifying the input
- The original list remains unchanged
- Clearer intent

**Option 2: Copy the input explicitly**

```python
def remove_duplicates(lst):
    lst = lst.copy()  # Work on a copy; the caller's list is untouched
    seen = set()
    i = 0
    while i < len(lst):
        if lst[i] in seen:
            lst.pop(i)
        else:
            seen.add(lst[i])
            i += 1
    return lst
```

Why this fixes it:

- `lst.copy()` creates a shallow copy
- Modifications don't affect the original
- Preserves the original algorithm structure
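A note on the shallow copy: it is sufficient here because the list holds immutable integers. If the list contained mutable elements such as nested lists, the inner objects would still be shared between the copy and the original, as this small illustration shows:

```python
nested = [[1, 2], [3, 4]]
shallow = nested.copy()

shallow.append([5, 6])  # adding an element: the original is unaffected
shallow[0].append(99)   # mutating a shared inner list: the original sees it too

print(nested)   # [[1, 2, 99], [3, 4]]
print(shallow)  # [[1, 2, 99], [3, 4], [5, 6]]
```

For nested data you would reach for `copy.deepcopy` instead.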

## Verification

To verify the fix:

1. Run the failing test: `pytest tests/test_list_utils.py::test_remove_duplicates`
2. Both assertions should pass
3. Add an additional test for immutability:

```python
def test_remove_duplicates_immutable():
    original = [1, 2, 2, 3]
    original_copy = original.copy()
    result = remove_duplicates(original)
    assert original == original_copy  # Verify no mutation
```

## Related Issues

This bug could affect:

- Any code that assumes `remove_duplicates` doesn't modify its input
- Functions that reuse the input list after calling `remove_duplicates`
- Concurrent code where multiple threads access the same list
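The aliasing hazard described above can be demonstrated directly. This sketch reuses the buggy implementation from the example; the input and the return value turn out to be the very same (mutated) object:

```python
def remove_duplicates(lst):  # buggy version from the example above
    seen = set()
    i = 0
    while i < len(lst):
        if lst[i] in seen:
            lst.pop(i)  # mutates the caller's list in place
        else:
            seen.add(lst[i])
            i += 1
    return lst

data = [1, 2, 2, 3, 3, 3, 4]
result = remove_duplicates(data)
print(result is data)  # True: no new list was created
print(data)            # [1, 2, 3, 4]: the caller's original content is gone
```

Any later code that expected `data` to still hold the duplicates now silently operates on the deduplicated list.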

## Analysis Strategies

For detailed analysis strategies by language and framework, see [references/analysis_strategies.md](references/analysis_strategies.md).

Strategies include:
- Python (pytest, unittest)
- JavaScript (Jest, Mocha)
- Java (JUnit)
- C/C++ (Google Test)
- Go (testing package)

## Best Practices

1. **Start with the failure message** - It often points directly to the bug
2. **Understand test intent** - Know what should happen
3. **Trace execution carefully** - Follow the actual code path
4. **Look for common patterns** - Many bugs follow known patterns
5. **Consider edge cases** - Bugs often hide at boundaries
6. **Check assumptions** - Verify what the code assumes
7. **Explain clearly** - Make the bug mechanism understandable

## Red Flags

Watch for these suspicious patterns:

**High-priority red flags:**
- Uninitialized variables
- Missing return statements
- Wrong operators (`==` vs `=`, `//` vs `/`)
- Off-by-one errors (< vs <=)
- Null/None without checks
- Mutable default arguments
- Side effects in pure functions
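One flag from the list above, mutable default arguments, deserves a quick illustration because the failure only appears on the second call:

```python
def append_item(item, items=[]):  # BUG: one list is shared across all calls
    items.append(item)
    return items

first = append_item(1)
second = append_item(2)
print(second)           # [1, 2]: the default list remembered the first call
print(first is second)  # True: both calls returned the same object

def append_item_fixed(item, items=None):
    if items is None:   # safe idiom: create a fresh list on every call
        items = []
    items.append(item)
    return items
```

Default values are evaluated once at function definition, not per call, which is why the `None` sentinel is the standard fix.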

**Medium-priority warnings:**
- Complex conditionals
- Nested loops with breaks
- Exception swallowing
- Type conversions
- Global state access

## Report Template

```markdown
# Bug Analysis Report

## Test Failure Summary
- Test name and location
- Failure type
- Failed assertion/error

## Expected vs Actual
- What should happen
- What actually happened

## Root Cause
- Bug location (file:line)
- Bug type
- Problematic code snippet

## Bug Mechanism
- Step-by-step explanation
- Why it's wrong
- How test exposes it

## Execution Trace
- Detailed trace from test to failure
- Variable values at key points

## Suspicious Code Regions
- Primary suspects with evidence
- Secondary suspects

## Recommended Fix
- Proposed code change
- Explanation of why it fixes the bug
- How to verify

## Related Issues
- Other code that might be affected
```
