Reproduction Trace Instrumenter

Overview

This skill instruments source code to capture detailed execution traces for bug reproduction. It records function calls, variable values, control flow, and program state, then generates replay scripts to deterministically reproduce the bug for diagnosis.

Workflow

1. Identify the Bug Context

Before instrumentation, understand:

What is the bug or failure being investigated?
Which code paths are likely involved?
What inputs trigger the bug?
Is the bug deterministic or intermittent?

2. Instrument the Code

Use the instrumenter for your language. Only Python is supported at present (see Limitations):

Python Instrumentation

python scripts/python_instrumenter.py <source_file.py> -o <instrumented_file.py>

Options:

--no-functions: Disable function call tracing
--no-variables: Disable variable assignment tracing
--no-control-flow: Disable control flow tracing
--exclude <patterns>: Exclude functions whose names match one or more space-separated patterns (e.g., __init__ test_*)

Example:

# Full instrumentation
python scripts/python_instrumenter.py app.py -o app_instrumented.py

# Minimal instrumentation (functions only)
python scripts/python_instrumenter.py app.py -o app_instrumented.py --no-variables --no-control-flow

# Exclude test functions and double-underscore names
python scripts/python_instrumenter.py app.py -o app_instrumented.py --exclude test_ __

3. Run the Instrumented Code

Execute the instrumented program with the inputs that trigger the bug:

python app_instrumented.py

The execution trace is saved automatically to trace.json when the program exits.

Trace Output:

trace.json: Complete execution trace with all recorded events
Console output: Summary of trace recording

4. Analyze the Trace

Generate a human-readable summary:

python scripts/replay_generator.py trace.json --summary

This shows:

Total number of events
Event type distribution
Function call sequence
Maximum call depth
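
The four figures above can also be computed directly from a trace file. The following is a minimal sketch, assuming the trace layout shown under Trace Format; `summarize_trace` is a hypothetical helper and not part of scripts/replay_generator.py, whose internals are not shown here.

```python
from collections import Counter

def summarize_trace(trace: dict) -> dict:
    """Compute summary statistics from a trace dictionary (hypothetical helper)."""
    events = trace.get("traces", [])
    return {
        "total_events": len(events),
        # How many events of each type were recorded
        "event_types": dict(Counter(e["type"] for e in events)),
        # Function names in the order they were entered
        "call_sequence": [
            e["data"]["function"] for e in events if e["type"] == "function_entry"
        ],
        "max_depth": max((e.get("depth", 0) for e in events), default=0),
    }

# Sample data copied from the Trace Format section (timestamps omitted)
sample = {
    "traces": [
        {"seq": 1, "type": "function_entry", "depth": 0,
         "data": {"function": "calculate_total",
                  "arguments": {"price": 100, "tax_rate": 0.08}}},
        {"seq": 2, "type": "variable_assignment", "depth": 1,
         "data": {"variable": "tax", "value": 8.0, "type": "float"}},
    ]
}

summary = summarize_trace(sample)
print(summary)
```

For a real run, load the file first with `json.load(open("trace.json"))` and pass the result to the helper.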

5. Generate Replay Script

Create a replay script to reproduce the bug:

python scripts/replay_generator.py trace.json -o replay.py

Run the replay script:

python replay.py

The replay script executes the same sequence of operations, allowing you to:

Reproduce the bug consistently
Add breakpoints at specific steps
Modify values to test hypotheses
Understand the execution flow

Configuration

Use the trace configuration template to customize instrumentation:

cp assets/trace_config_template.json trace_config.json
# Edit trace_config.json as needed

Key Configuration Options:

Instrumentation Level:

trace_functions: Record function entry/exit
trace_variables: Record variable assignments
trace_control_flow: Record if/else, loops
trace_exceptions: Record exception handling

Filtering:

exclude_patterns: Function name patterns to skip
exclude_modules: Modules to skip entirely
max_string_length: Truncate long strings
max_call_depth: Limit trace depth

Performance:

buffer_size: Events to buffer before writing
async_write: Write traces asynchronously
max_trace_size_mb: Maximum trace file size
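
To make the filtering options concrete, here is a sketch of how exclude_patterns and max_string_length might be applied at trace time. It assumes a flat configuration dictionary and shell-style glob matching; the actual template layout and the instrumenter's matching rules may differ. `should_trace` and `truncate_value` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Assumed shape: a flat dictionary using the option names listed above
config = {
    "exclude_patterns": ["test_*", "__*__"],
    "max_string_length": 20,
}

def should_trace(function_name: str, config: dict) -> bool:
    """Return False when the name matches any exclude pattern (glob semantics assumed)."""
    patterns = config.get("exclude_patterns", [])
    return not any(fnmatchcase(function_name, p) for p in patterns)

def truncate_value(value, config: dict):
    """Shorten long strings so a single value cannot bloat the trace."""
    limit = config.get("max_string_length")
    if isinstance(value, str) and limit is not None and len(value) > limit:
        return value[:limit] + "..."
    return value

print(should_trace("calculate_total", config))
print(should_trace("test_login", config))
print(truncate_value("x" * 30, config))
```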

Instrumentation Levels

Choose the appropriate level based on your needs:

Minimal (Functions Only)

python scripts/python_instrumenter.py app.py -o app_inst.py --no-variables --no-control-flow

Overhead: 5-15%
Use when: You need to understand call sequence only
Trace size: Small

Standard (Functions + Variables)

python scripts/python_instrumenter.py app.py -o app_inst.py --no-control-flow

Overhead: 20-50%
Use when: You need to track state changes
Trace size: Medium

Full (Everything)

python scripts/python_instrumenter.py app.py -o app_inst.py

Overhead: 50-200%
Use when: You need complete execution details
Trace size: Large

Common Use Cases

Use Case 1: Intermittent Bug Reproduction

User: "I have a bug that only happens sometimes. Help me capture what's happening."
→ Instrument with full tracing
→ Run multiple times until bug occurs
→ Analyze the trace from the failing run
→ Generate replay script to reproduce consistently

Use Case 2: Understanding Complex Control Flow

User: "I don't understand why this function returns the wrong value."
→ Instrument with functions + variables
→ Run with problematic input
→ Review trace to see variable values at each step
→ Identify where the logic goes wrong

Use Case 3: Debugging Production Issues

User: "Users report a crash but I can't reproduce it locally."
→ Instrument production code (minimal level for performance)
→ Deploy and wait for crash
→ Retrieve trace.json from crashed instance
→ Generate replay script to reproduce locally

Use Case 4: Regression Testing

User: "I fixed a bug. How do I ensure it doesn't come back?"
→ Capture trace of the bug before fix
→ Generate replay script
→ Use replay script as regression test
→ Run after each code change
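
One way to wire a replay script into a test suite is to run it in a fresh interpreter and check its exit status. The sketch below assumes that a replay script exits with a non-zero status while the bug is still present; that behavior is an assumption, not something the generator is documented to guarantee. `replay_passes` is a hypothetical helper, and the two temporary scripts are stand-ins for a generated replay.py.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def replay_passes(replay_script) -> bool:
    """Run a replay script in a new interpreter; True if it exits with status 0."""
    result = subprocess.run(
        [sys.executable, str(replay_script)],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return result.returncode == 0

# Stand-in scripts: one that succeeds, one that raises (simulating a reproduced bug)
with tempfile.TemporaryDirectory() as tmp:
    fixed = Path(tmp) / "replay_fixed.py"
    fixed.write_text("assert 1 + 1 == 2\n")
    broken = Path(tmp) / "replay_broken.py"
    broken.write_text("raise ValueError('bug reproduced')\n")

    fixed_ok = replay_passes(fixed)
    broken_ok = replay_passes(broken)

print(fixed_ok, broken_ok)
```

In a test suite, `assert replay_passes("replay.py")` inside a test function would fail whenever the replay raises again.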

Trace Format

Traces are stored in JSON format with the following structure:

{
  "traces": [
    {
      "seq": 1,
      "timestamp": "2024-01-15T10:30:45.123",
      "type": "function_entry",
      "depth": 0,
      "data": {
        "function": "calculate_total",
        "arguments": {"price": 100, "tax_rate": 0.08}
      }
    },
    {
      "seq": 2,
      "timestamp": "2024-01-15T10:30:45.125",
      "type": "variable_assignment",
      "depth": 1,
      "data": {
        "variable": "tax",
        "value": 8.0,
        "type": "float"
      }
    }
  ],
  "metadata": {
    "total_events": 2,
    "max_depth": 1
  }
}
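
Because every event carries a depth, a trace in this format can be rendered as an indented listing. This is a small sketch that handles only the two event types shown above; `format_event` is a hypothetical helper, and other event types fall back to a generic label.

```python
def format_event(event: dict) -> str:
    """Render one trace event as a line indented by its call depth."""
    indent = "  " * event.get("depth", 0)
    data = event.get("data", {})
    if event["type"] == "function_entry":
        args = ", ".join(f"{k}={v!r}" for k, v in data.get("arguments", {}).items())
        return f"{indent}-> {data['function']}({args})"
    if event["type"] == "variable_assignment":
        return f"{indent}{data['variable']} = {data['value']!r}"
    # Event types not shown in the example above are printed generically
    return f"{indent}[{event['type']}]"

events = [
    {"seq": 1, "type": "function_entry", "depth": 0,
     "data": {"function": "calculate_total",
              "arguments": {"price": 100, "tax_rate": 0.08}}},
    {"seq": 2, "type": "variable_assignment", "depth": 1,
     "data": {"variable": "tax", "value": 8.0, "type": "float"}},
]

for event in events:
    print(format_event(event))
```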

Best Practices

Start Minimal: Begin with function-level tracing, add detail as needed
Focus on Bug Area: Use --exclude to skip irrelevant code paths
Test Instrumentation: Verify instrumented code behaves the same as original
Manage Trace Size: Use filtering to keep traces manageable
Validate Replay: Ensure replay script reproduces the bug consistently
Clean Up: Remove instrumentation before committing code

Limitations

Observer Effect: Instrumentation may change timing and behavior
- Minimize by using lower instrumentation levels
- Be aware of race conditions in concurrent code
Performance Overhead: Instrumented code runs slower
- Use sampling or selective instrumentation for performance-critical code
Trace Size: Full traces can be very large
- Apply filtering and size limits
- Focus on specific code regions
Non-Determinism: Some bugs involve external factors
- Record external inputs (network, file system, time)
- Use deterministic mode in configuration
Language Support: Currently supports Python only
- See references/instrumentation_techniques.md for other languages
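
The advice to record external inputs can be illustrated with a small record-and-replay wrapper. This is a sketch of the general technique only, not the tool's deterministic mode; `InputRecorder` is a hypothetical class.

```python
import time

class InputRecorder:
    """Record results of non-deterministic calls, or replay previously recorded ones."""

    def __init__(self, recorded=None):
        # Passing a list of recorded values switches the instance to replay mode
        self.replaying = recorded is not None
        self.values = list(recorded) if recorded else []
        self._cursor = 0

    def call(self, func, *args):
        if self.replaying:
            # Return the next recorded value instead of calling the real function
            value = self.values[self._cursor]
            self._cursor += 1
            return value
        value = func(*args)
        self.values.append(value)
        return value

# Recording run: the real clock is consulted and its results are stored
recorder = InputRecorder()
first = recorder.call(time.time)
second = recorder.call(time.time)

# Replay run: the stored values are returned in the same order
replay = InputRecorder(recorder.values)
replayed_first = replay.call(time.time)
replayed_second = replay.call(time.time)
print(replayed_first == first, replayed_second == second)
```

The recorded values would need to be saved alongside the trace so that a later replay sees the same inputs.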

Advanced Topics

Custom Instrumentation

Modify scripts/python_instrumenter.py to add custom tracing:

Trace specific function arguments
Record custom metrics
Add conditional breakpoints
Integrate with logging frameworks

Multi-Process Tracing

For programs with multiple processes:

Instrument each process separately
Use process ID in trace filenames
Merge traces for analysis
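
The merge step could look like the following sketch. It assumes one file per process named trace_<pid>.json (a hypothetical naming scheme), the trace layout from Trace Format, and timestamps taken from the same clock in the same ISO format so that they sort correctly as strings. `merge_traces` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path

def merge_traces(paths):
    """Combine per-process trace files into one list ordered by timestamp."""
    merged = []
    for path in map(Path, paths):
        pid = path.stem.rsplit("_", 1)[-1]  # "trace_101" -> "101"
        with path.open() as f:
            events = json.load(f)["traces"]
        # Tag each event with its process so the source stays identifiable
        merged.extend({**event, "pid": pid} for event in events)
    merged.sort(key=lambda e: (e["timestamp"], e["pid"], e.get("seq", 0)))
    return merged

def _event(seq, timestamp, function):
    return {"seq": seq, "timestamp": timestamp, "type": "function_entry",
            "depth": 0, "data": {"function": function, "arguments": {}}}

# Demonstration with two small temporary trace files
with tempfile.TemporaryDirectory() as tmp:
    parent = Path(tmp) / "trace_101.json"
    child = Path(tmp) / "trace_102.json"
    parent.write_text(json.dumps({"traces": [
        _event(1, "2024-01-15T10:30:45.123", "spawn_worker"),
        _event(2, "2024-01-15T10:30:45.130", "collect_result"),
    ]}))
    child.write_text(json.dumps({"traces": [
        _event(1, "2024-01-15T10:30:45.125", "do_work"),
    ]}))
    merged = merge_traces([parent, child])

for event in merged:
    print(event["pid"], event["timestamp"], event["data"]["function"])
```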

Distributed System Tracing

For distributed systems:

Add correlation IDs to trace events
Synchronize timestamps across nodes
Use distributed tracing tools (Jaeger, Zipkin)

Resources

scripts/python_instrumenter.py

AST-based Python code instrumenter that:

Parses Python source code
Inserts tracing calls at key points
Generates instrumented code with embedded trace runtime
Supports configurable instrumentation levels

scripts/replay_generator.py

Trace replay script generator that:

Reads execution traces from JSON
Generates executable Python replay scripts
Provides trace summaries and statistics
Enables deterministic bug reproduction

references/instrumentation_techniques.md

Comprehensive guide covering:

Instrumentation approaches (source, bytecode, dynamic)
What to trace and how to filter
Trace reduction strategies
Deterministic replay techniques
Language-specific considerations
Performance optimization
Best practices and common pitfalls

Read this reference when you need a deeper understanding of instrumentation theory, want to implement instrumenters for other languages, or need to optimize trace performance.

assets/trace_config_template.json

Configuration template for customizing:

Instrumentation levels
Filtering rules
Performance settings
Replay options

Copy and modify this template to create custom trace configurations for specific use cases.
