Debug

Systematic debugging for errors, test failures, unexpected behavior, and production issues.

Usage

/debug [issue] [--logs] [--correlate] [--trace] [--type bug|build|perf|deploy]

Options

Flag	Purpose
`--logs`	Enable log pattern analysis (error spikes, frequency, types)
`--correlate`	Run SQL correlation queries on structured logs
`--trace`	Deep stack trace analysis with context
`--type`	Issue category: bug, build, perf(ormance), deploy(ment)

When to Use

Error messages or stack traces appear
Tests are failing
Code behaves unexpectedly
User says "it's broken" or "not working"
Production errors need investigation (--logs)
Need to correlate errors across systems (--correlate)
Deep stack analysis required (--trace)

Debugging Process

Capture - Get error message, stack trace, and reproduction steps
Isolate - Narrow down the failure location
Hypothesize - Form theories about the cause
Test - Validate hypotheses with evidence
Fix - Implement minimal fix
Verify - Confirm solution works

Investigation Steps

# Check recent changes that might have caused the issue
git log --oneline -10
git diff HEAD~3

# Find error patterns in logs
grep -r "error\|Error\|ERROR" logs/ 2>/dev/null | tail -20

# Check test output
npm test 2>&1 | tail -50  # or pytest, cargo test, etc.

Log Analysis (`--logs`)

Find Errors

# Recent errors with context
grep -B 5 -A 10 "ERROR" /var/log/app.log

# Count by error type
grep -oE "Error: [^:]*" app.log | sort | uniq -c | sort -rn

# Errors in time range
awk '/2024-01-15 14:/ && /ERROR/' app.log

# Find repeated errors
grep "ERROR" app.log | cut -d']' -f2 | sort | uniq -c | sort -rn | head -20

# Find error spikes
grep "ERROR" app.log | cut -d' ' -f1-2 | uniq -c | sort -rn

Common Patterns

Pattern	Indicates	Action
NullPointer	Missing null check	Add validation
Timeout	Slow dependency	Add timeout, retry
Connection refused	Service down	Check health, retry
OOM	Memory leak	Profile, increase limits
Rate limit	Too many requests	Add backoff, queue

Correlation Queries (`--correlate`)

-- Errors by endpoint
SELECT endpoint, count(*) as errors
FROM logs
WHERE level = 'ERROR' AND time > NOW() - INTERVAL '1 hour'
GROUP BY endpoint ORDER BY errors DESC;

-- Error rate over time
SELECT
  date_trunc('minute', time) as minute,
  count(*) filter (where level = 'ERROR') as errors,
  count(*) as total
FROM logs
WHERE time > NOW() - INTERVAL '1 hour'
GROUP BY minute ORDER BY minute;

-- Correlate request IDs across services
SELECT service, message, time
FROM logs
WHERE request_id = 'req-12345'
ORDER BY time;

Stack Trace Analysis (`--trace`)

Parse Stack Traces

import re

def parse_stack_trace(log_content: str) -> list[dict]:
    pattern = r'(?P<exception>\w+Error|\w+Exception): (?P<message>.*?)\n(?P<trace>(?:\s+at .+\n)+)'
    traces = []
    for match in re.finditer(pattern, log_content):
        traces.append({
            'type': match.group('exception'),
            'message': match.group('message'),
            'trace': match.group('trace').strip().split('\n')
        })
    return traces

Investigation Checklist

Capture - Get full error message and stack trace
Timestamp - When did it start?
Frequency - How often? Increasing?
Scope - All users or specific?
Changes - Recent deployments?
Dependencies - External services affected?

Output Format

## Debug Report

**Issue:** [Brief description]
**Root Cause:** [What's actually wrong]

### Evidence

- [Finding 1]
- [Finding 2]

### Fix

[Code or configuration change]

### Verification

[How to confirm the fix works]

### Prevention

[How to prevent this in the future]

Examples

Input: "TypeError: Cannot read property 'map' of undefined" Action: Trace the undefined value, find where data should be initialized, fix the source

Input: "Tests are failing" Action: Run tests, capture failures, analyze each failure, fix underlying issues

Input: /debug --logs "API returning 500 errors" Action: Search logs for 500 status, find stack traces, identify root cause, check error frequency

Input: /debug --correlate "intermittent failures" Action: Run correlation queries to find patterns, identify affected endpoints, correlate with events

debug