Symbolic Execution Assistant

Perform symbolic execution analysis to detect errors and generate test inputs by exploring program paths with symbolic variables.

What is Symbolic Execution?

Symbolic execution executes code with symbolic values (representing any possible value) instead of concrete values. This allows exploring multiple execution paths simultaneously and detecting errors that might only occur with specific inputs.

Key Concepts:

Symbolic variables: Variables with unknown values (e.g., α, β instead of 5, 10)
Path constraints: Conditions accumulated along each execution path
Path explosion: Number of paths grows exponentially with branches
Constraint solver: Tool (like Z3) that finds concrete values satisfying constraints

Workflow

Step 1: Identify the Function to Analyze

Select the function and determine what to analyze for.

Questions to ask:

What bugs might this function have? (null refs, div by zero, overflows, assertions)
What inputs could trigger errors?
Which execution paths are critical?
Are there complex conditionals that need exploration?

Example:

def calculate_discount(price, customer_type):
    """Calculate discount based on customer type."""
    if customer_type == "premium":
        discount = price * 0.2
    elif customer_type == "regular":
        discount = price * 0.1
    else:
        discount = 0

    final_price = price - discount
    return final_price

Analysis goals:

Explore all three branches (premium, regular, other)
Check for potential arithmetic errors
Generate test inputs for each path

Step 2: Set Up Symbolic Variables

Replace concrete inputs with symbolic variables.

Manual Symbolic Execution:

Input: price = α (symbolic), customer_type = β (symbolic)
Initial constraints: α ∈ ℝ, β ∈ String
Initial state: { price: α, customer_type: β }

Using Python with Z3:

from z3 import *

# Create symbolic variables
price = Real('price')
customer_type = String('customer_type')

# Create solver
solver = Solver()

Using Java with Symbolic PathFinder (SPF):

// Annotate symbolic inputs
public static void calculate_discount(double price, String customer_type) {
    // SPF will make these symbolic via configuration
}

Using C with KLEE:

#include <klee/klee.h>

int main() {
    float price;
    char customer_type[20];

    klee_make_symbolic(&price, sizeof(price), "price");
    klee_make_symbolic(customer_type, sizeof(customer_type), "customer_type");

    calculate_discount(price, customer_type);
    return 0;
}

Step 3: Execute Symbolically and Build Path Tree

Trace through code, tracking constraints for each branch.

Manual Execution Example:

State 0: { price: α, customer_type: β }
Path constraint: (none)

Branch 1: customer_type == "premium"
  State 1a: { discount: α * 0.2, final_price: α - (α * 0.2) }
  Path constraint: β = "premium"

Branch 2: customer_type == "regular"
  State 1b: { discount: α * 0.1, final_price: α - (α * 0.1) }
  Path constraint: β ≠ "premium" ∧ β = "regular"

Branch 3: else
  State 1c: { discount: 0, final_price: α }
  Path constraint: β ≠ "premium" ∧ β ≠ "regular"

Path Tree Visualization:

                    [Initial State]
                     price = α
                  customer_type = β
                         |
        +----------------+----------------+
        |                |                |
  β = "premium"    β = "regular"      else
        |                |                |
  discount=α*0.2   discount=α*0.1    discount=0
  Path 1           Path 2            Path 3

For detailed path tree construction techniques, see references/path_exploration.md.

Step 4: Identify Error Conditions

Look for states where errors could occur.

Common Error Patterns:

Error Type	Check For	Example Constraint
Division by zero	denominator == 0	x / y where y = 0
Null dereference	variable == null	obj.method() where obj = null
Buffer overflow	index >= array.length	arr[i] where i ≥ len(arr)
Assertion violation	assertion condition false	assert x > 0 where x ≤ 0
Integer overflow	result > MAX_INT	a + b > 2³¹-1
Negative array index	index < 0	arr[i] where i < 0

Example with Division by Zero:

def safe_divide(a, b):
    if b != 0:
        return a / b
    else:
        return None

Symbolic execution:

State 0: { a: α, b: β }

Branch 1: b != 0
  Path constraint: β ≠ 0
  Result: α / β (safe)

Branch 2: b == 0
  Path constraint: β = 0
  Result: None (safe)

ERROR CHECK: Division by zero?
  Constraint: β = 0 AND execution reaches "a / b"
  Result: NO (the if-check prevents it)

Example with Null Dereference:

public int getLength(String str) {
    if (str != null) {
        return str.length();
    }
    return 0;
}

Symbolic execution:

State 0: { str: α }

Branch 1: str != null
  Path constraint: α ≠ null
  Result: α.length() (safe)

Branch 2: str == null
  Path constraint: α = null
  Result: 0 (safe)

ERROR CHECK: Null dereference?
  Constraint: α = null AND execution reaches str.length()
  Result: NO (protected by null check)

Step 5: Solve Constraints for Test Inputs

Use constraint solver to find concrete values that exercise each path or trigger errors.

Manual Constraint Solving:

For simple constraints, solve manually:

Path 1 constraint: β = "premium"
Solution: price = 100, customer_type = "premium"

Path 2 constraint: β ≠ "premium" ∧ β = "regular"
Solution: price = 100, customer_type = "regular"

Path 3 constraint: β ≠ "premium" ∧ β ≠ "regular"
Solution: price = 100, customer_type = "guest"

Using Z3 Solver (Python):

from z3 import *

# Define symbolic variables
price = Real('price')
customer_type = String('customer_type')

# Solve for Path 1: premium customer
solver = Solver()
solver.add(customer_type == StringVal("premium"))
solver.add(price > 0)  # Add reasonable constraints

if solver.check() == sat:
    model = solver.model()
    print(f"Test input for Path 1: price={model[price]}, customer_type={model[customer_type]}")

# Solve for Path 2: regular customer
solver2 = Solver()
solver2.add(customer_type == StringVal("regular"))
solver2.add(price > 0)

if solver2.check() == sat:
    model = solver2.model()
    print(f"Test input for Path 2: price={model[price]}, customer_type={model[customer_type]}")

Using Z3 for Error Detection:

# Check for division by zero
a = Int('a')
b = Int('b')

solver = Solver()
solver.add(b == 0)  # Error condition: divisor is zero
solver.add(a > 0)   # Additional context

if solver.check() == sat:
    model = solver.model()
    print(f"ERROR: Division by zero possible with a={model[a]}, b={model[b]}")

For comprehensive constraint solving techniques, see references/constraint_solving.md.

Step 6: Generate Test Cases

Convert solved constraints into executable test cases.

Test Case Template:

import pytest

class TestCalculateDiscount:
    # Path 1: Premium customer
    def test_premium_customer(self):
        """Test premium customer path."""
        # Generated from constraint: customer_type = "premium"
        result = calculate_discount(100, "premium")
        assert result == 80  # 100 - 20% discount

    # Path 2: Regular customer
    def test_regular_customer(self):
        """Test regular customer path."""
        # Generated from constraint: customer_type = "regular"
        result = calculate_discount(100, "regular")
        assert result == 90  # 100 - 10% discount

    # Path 3: Other customer type
    def test_other_customer(self):
        """Test other customer type path."""
        # Generated from constraint: customer_type ∉ {"premium", "regular"}
        result = calculate_discount(100, "guest")
        assert result == 100  # No discount

Java Test Cases:

import org.junit.Test;
import static org.junit.Assert.*;

public class TestCalculateDiscount {
    @Test
    public void testPremiumCustomer() {
        // Path 1: Premium customer
        double result = calculateDiscount(100.0, "premium");
        assertEquals(80.0, result, 0.01);
    }

    @Test
    public void testRegularCustomer() {
        // Path 2: Regular customer
        double result = calculateDiscount(100.0, "regular");
        assertEquals(90.0, result, 0.01);
    }

    @Test
    public void testOtherCustomer() {
        // Path 3: Other customer
        double result = calculateDiscount(100.0, "guest");
        assertEquals(100.0, result, 0.01);
    }
}

Step 7: Report Findings

Document discovered paths, errors, and generated tests.

Report Template:

# Symbolic Execution Report: calculate_discount

## Function Analyzed
`calculate_discount(price, customer_type)`

## Paths Discovered
- **Path 1**: Premium customer (customer_type = "premium")
  - Constraint: β = "premium"
  - Behavior: 20% discount applied
  - Test input: price=100, customer_type="premium"

- **Path 2**: Regular customer (customer_type = "regular")
  - Constraint: β ≠ "premium" ∧ β = "regular"
  - Behavior: 10% discount applied
  - Test input: price=100, customer_type="regular"

- **Path 3**: Other customer types
  - Constraint: β ∉ {"premium", "regular"}
  - Behavior: No discount
  - Test input: price=100, customer_type="guest"

## Errors Detected
None. All paths are safe.

## Generated Test Cases
3 test cases generated (see test_calculate_discount.py)

## Coverage
- Branch coverage: 100% (all 3 branches)
- Path coverage: 100% (all 3 paths)

## Recommendations
- Consider validating customer_type against known values
- Add explicit error handling for negative prices

Symbolic Execution Tools

Python Tools

Z3 Theorem Prover:

pip install z3-solver

angr (Binary Analysis Framework):

pip install angr

Crosshair (Symbolic Testing):

pip install crosshair-tool

Java Tools

Symbolic PathFinder (SPF):

Extension of Java PathFinder
Symbolic execution for Java bytecode
Configuration via .jpf files

JDart:

Dynamic symbolic execution for Java
Integrates with DART framework

C/C++ Tools

KLEE:

# Uses LLVM bitcode
clang -emit-llvm -c program.c -o program.bc
klee program.bc

Symbolic Execution Engine (SEE):

Symbolic execution for C programs
Built on top of LLVM

For detailed tool setup and usage, see references/tool_integration.md.

Handling Path Explosion

Path explosion occurs when the number of paths grows exponentially.

Mitigation Strategies:

Bounded Execution: Limit search depth

# Limit loop iterations
for i in range(min(len(array), 10)):  # Max 10 iterations
    process(array[i])

Path Pruning: Eliminate infeasible paths early

if not is_feasible(path_constraint):
    prune_path()

State Merging: Combine similar states

# Merge states with same program counter
if state1.pc == state2.pc:
    merged_state = merge(state1, state2)

Selective Exploration: Focus on critical paths

# Prioritize paths with error conditions
if contains_error_check(path):
    explore_first(path)

Concolic Execution: Mix concrete and symbolic execution

# Start with concrete value, switch to symbolic when needed
x = 5  # Concrete initially
if complex_condition(x):
    x = make_symbolic(x)  # Switch to symbolic

Tips

Start small: Analyze simple functions before complex ones
Focus on critical code: Prioritize security-sensitive or error-prone code
Use tools when possible: Manual symbolic execution is labor-intensive
Set time limits: Path explosion can make analysis impractical
Combine techniques: Use both manual analysis and automated tools
Validate generated tests: Ensure tests actually run and make sense
Document assumptions: Note any simplifications or constraints

Common Use Cases

1. Bug Detection

Find null pointer dereferences
Detect division by zero
Identify buffer overflows
Catch assertion violations

2. Test Generation

Generate inputs for full path coverage
Create edge case tests
Produce regression test suites

3. Security Analysis

Find exploitable vulnerabilities
Detect integer overflows
Identify injection points

4. Equivalence Checking

Verify refactored code behaves identically
Check optimization correctness

References

For detailed information on specific topics:

references/path_exploration.md - Path tree construction and exploration strategies
references/constraint_solving.md - Z3 solver usage and constraint patterns
references/tool_integration.md - Setup and usage of KLEE, angr, SPF, and other tools

symbolic-execution-assistant