sc-log-fix
SKILL.md
Log-Fix: Iterative Bug Fixing from Log Analysis
Analyze logs from multiple sources, identify errors, perform root cause analysis, and interactively fix issues with pattern learning.
Quick Start
# Analyze recent errors from all log sources
/sc:log-fix
# Filter by time and severity
/sc:log-fix --time 1h --level ERROR
# Target specific log source
/sc:log-fix --source backend --component app.services
# Trace a specific request
/sc:log-fix --request-id abc-123-def
# Dry run - analyze without fixing
/sc:log-fix --dry-run
Behavioral Flow
- Discover - Detect available log sources (files, docker, systemd)
- Load Knowledge - Check for learned patterns from previous fixes
- Collect - Parse and normalize logs from detected sources
- Triage - Cluster errors, rank by severity, match known patterns
- Analyze - Root cause analysis with stack traces and code inspection
- Fix - Interactive fix loop with user approval
- Learn - Save successful fix patterns to knowledge base
- Report - Summary of session with remaining issues
Flags
| Flag | Type | Default | Description |
|---|---|---|---|
| `--time` | string | 1h | Time range: 30m, 1h, 1d, 7d |
| `--source` | string | all | Log source: backend, frontend, docker, all |
| `--level` | string | ERROR | Minimum severity: ERROR, WARNING, all |
| `--component` | string | - | Filter by logger/module name |
| `--request-id` | string | - | Trace specific request across logs |
| `--dry-run` | bool | false | Analyze and report without applying fixes |
| `--fix` | bool | true | Enable interactive fix loop |
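As an illustration of how a `--time` value could be interpreted, here is a minimal sketch. The helper name `parse_time_range` is hypothetical and not part of the skill. It accepts only the `m`, `h`, and `d` suffixes that appear in the table above.

```python
import re
from datetime import timedelta

# Unit suffixes taken from the --time examples (30m, 1h, 1d, 7d).
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_time_range(value: str) -> timedelta:
    """Convert a --time value such as '30m' or '7d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", value.strip())
    if match is None:
        raise ValueError(f"unsupported time range: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```

Log records older than `now - parse_time_range(value)` would then be dropped before triage.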
Phase 0: Log Source Discovery
Auto-detect all available log sources before analysis.
Detection Strategy
| Source | Detection Method | Common Locations |
|---|---|---|
| File logs | Glob for `logs/**/*.log`, `*.log` | `logs/`, `var/log/` |
| Docker | Check `docker compose ps` | `docker-compose.yml` |
| Systemd | Check `journalctl` availability | System services |
| PM2 | Check `pm2 list` | Node.js apps |
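The detection table can be approximated with a few filesystem and `PATH` checks. This is a sketch under assumptions: `discover_log_sources` is a hypothetical helper, and it only checks that the relevant CLI exists. It does not run `docker compose ps` or `pm2 list`.

```python
import shutil
from pathlib import Path


def discover_log_sources(root: str = ".") -> dict:
    """Report which log sources look available under a project root."""
    base = Path(root)
    # File logs: logs/**/*.log plus *.log directly in the project root.
    files = sorted({*base.glob("logs/**/*.log"), *base.glob("*.log")})
    has_compose = (base / "docker-compose.yml").is_file()
    return {
        "files": [str(p) for p in files],
        # Docker: a compose file exists and the docker CLI is on PATH.
        "docker": has_compose and shutil.which("docker") is not None,
        # Systemd and PM2: only the presence of the CLI is checked here.
        "systemd": shutil.which("journalctl") is not None,
        "pm2": shutil.which("pm2") is not None,
    }
```

A fuller version would invoke each CLI to confirm that services are actually running.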
Log Format Detection
Auto-detect log format:
- JSON (structured) - Parse with `jq`
- Text (unstructured) - Parse with regex patterns
- Combined (nginx/apache style) - Parse with format-specific regex
# Detect format by reading first line
head -1 logs/app.log | python3 -c "import sys,json; json.loads(sys.stdin.read()); print('json')" 2>/dev/null || echo "text"
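The shell one-liner distinguishes only JSON from text. A sketch that also recognizes the combined format might look like the following. The function name `detect_log_format` and the exact regex are assumptions, not the skill's actual implementation.

```python
import json
import re

# Common/combined log prefix: host ident user [timestamp] "request" status size
_COMBINED = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?:\d+|-)')


def detect_log_format(first_line: str) -> str:
    """Classify a log line as 'json', 'combined', or 'text'."""
    line = first_line.strip()
    try:
        # Only JSON objects count as structured logs; bare scalars do not.
        if isinstance(json.loads(line), dict):
            return "json"
    except ValueError:
        pass
    if _COMBINED.match(line):
        return "combined"
    return "text"
```

Requiring a JSON object avoids classifying a line such as `123` as structured.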
Phase 1: Knowledge Loading
Load historical fix knowledge for enhanced analysis.
Knowledge File Structure
{
"version": "1.0",
"error_patterns": [
{
"id": "pattern-001",
"signature": {
"message_pattern": "Connection refused.*port \\d+",
"component": "app.services.database",
"exception_type": "ConnectionError"
},
"fixes": [{
"description": "Check database is running, verify connection string",
"file_changed": "src/config.py",
"validation_command": "pytest tests/test_db.py -v",
"success_count": 3,
"confidence": 0.9
}]
}
],
"component_knowledge": {},
"metadata": {
"total_fixes": 0,
"updated_at": null
}
}
Location: .claude/knowledge/log-fix-knowledge.json
When knowledge is loaded:
- Triage: Show "Known issue" badge for previously seen errors
- Root Cause: Suggest fixes with historical success rates
- Fix Loop: Offer one-click application of proven fixes
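To show how a loaded knowledge file could drive the "Known issue" badge, here is a minimal matching sketch. The record keys (`message`, `component`, `exception_type`) and the helper name `find_known_patterns` are assumptions. Only the knowledge file layout comes from the structure shown above.

```python
import re


def find_known_patterns(record: dict, knowledge: dict) -> list:
    """Return knowledge-base patterns whose signature matches a log record."""
    matches = []
    for pattern in knowledge.get("error_patterns", []):
        sig = pattern["signature"]
        if not re.search(sig["message_pattern"], record.get("message", "")):
            continue
        # component and exception_type act as optional exact-match filters.
        if any(sig.get(k) and sig[k] != record.get(k)
               for k in ("component", "exception_type")):
            continue
        matches.append(pattern)
    return matches
```

Each returned pattern carries its `fixes` list, so `success_count` and `confidence` can be shown alongside the suggestion.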
Phase 2: Triage and Clustering
Incident Clustering
Group by:
- Same error type (identical/similar messages)
- Same component (logger/service name)
- Same request (request_id correlation)
- Same time window (errors within 5s)
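The grouping rules above can be sketched as follows. This assumes each record is a dict with `timestamp` (a `datetime`), `component`, and `message` keys. The names `normalize` and `cluster_errors` are hypothetical, and request-ID correlation is left out for brevity.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta


def normalize(message: str) -> str:
    """Mask numbers and hex ids so similar messages share one key."""
    return re.sub(r"\b(?:0x[0-9a-f]+|\d+)\b", "<N>", message, flags=re.IGNORECASE)


def cluster_errors(records, window=timedelta(seconds=5)):
    """Group by (component, normalized message), split into time bursts."""
    groups = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        bursts = groups[(rec["component"], normalize(rec["message"]))]
        # Extend the current burst if this record is within `window`
        # of the previous record in it; otherwise start a new burst.
        if bursts and rec["timestamp"] - bursts[-1][-1]["timestamp"] <= window:
            bursts[-1].append(rec)
        else:
            bursts.append([rec])
    return groups
```

Each key maps to a list of bursts, so the total count per key gives the frequency column of the triage summary.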
Severity Ranking
| Priority | Criteria |
|---|---|
| P0 | Stack trace, service crash, database errors |
| P1 | ERROR level, API failures, auth issues |
| P2 | WARNING level, deprecation, performance |
| P3 | Informational, cleanup suggestions |
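One way to turn the ranking table into code is a small rule cascade. The keyword checks below are a crude stand-in for "service crash" and "database errors", and the record keys (`level`, `message`, `stack_trace`) are assumptions.

```python
def assign_priority(record: dict) -> str:
    """Map a log record to P0-P3 following the severity table."""
    level = record.get("level", "").upper()
    message = record.get("message", "").lower()
    # P0: stack trace present, or crash/database keywords in the message.
    if record.get("stack_trace") or any(w in message for w in ("crash", "database")):
        return "P0"
    if level == "ERROR":
        return "P1"
    if level == "WARNING":
        return "P2"
    return "P3"
```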
Triage Summary Output
## Error Triage Summary
**Time Range**: [start] to [end]
**Total Errors**: N | **Unique Types**: M | **Known Patterns**: K
### Top Issues by Frequency
| # | Count | Component | Error Pattern | Priority | Historical |
|---|-------|-----------|---------------|----------|------------|
| 1 | 15 | app.services.api | Connection timeout | P1 | Known: 3 fixes, 90% |
| 2 | 8 | app.models | Validation failed | P1 | New pattern |
Phase 3: Root Cause Analysis
For each incident cluster:
- Stack Trace Analysis - Extract file paths, identify failing line, map to components
- Log Correlation - Follow request_id across services
- Pattern Matching - Check knowledge base, then use code-level analysis
- Code Inspection - Read relevant source files, identify the bug
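For Python tracebacks, the stack trace step could be sketched like this. The `src/` project prefix and both helper names are assumptions. Other languages would need their own frame regex.

```python
import re

# Matches the standard CPython frame line: File "path", line N, in func
_FRAME = re.compile(
    r'^[ \t]*File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>.+)$',
    re.MULTILINE,
)


def extract_frames(trace: str) -> list:
    """Return (path, line, function) for every frame, outermost first."""
    return [(m["path"], int(m["line"]), m["func"]) for m in _FRAME.finditer(trace)]


def failing_frame(trace: str, project_prefix: str = "src/"):
    """Innermost frame inside the project: the likeliest place to inspect."""
    frames = [f for f in extract_frames(trace) if f[0].startswith(project_prefix)]
    return frames[-1] if frames else None
```

Filtering by prefix skips library frames, so the result points at project code such as `src/services/api.py:145`.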
Root Cause Output
## Root Cause: [Error Type]
**Occurrences**: N | **Component**: [logger]
### Error Details
[Full error and stack trace]
### Relevant Code
**File**: `src/services/api.py:145`
[Code context with highlighted line]
### Probable Cause
[Analysis explaining why the error occurs]
### Suggested Fix
[Specific code change with diff preview]
Phase 4: Interactive Fix Loop
Fix Workflow
- Present - Show error, root cause, affected files
- Show diff - Display exact code changes proposed
- Ask permission - User approves, skips, or modifies
- Apply if approved - Use Edit tool
- Validate - Run relevant tests
- Learn - Offer to save pattern to knowledge base
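The "Show diff" step could be produced with the standard library's `difflib`. This is a sketch only. The helper name `preview_diff` is hypothetical, and the skill itself applies changes through the Edit tool.

```python
import difflib


def preview_diff(path: str, original: str, proposed: str) -> str:
    """Render a unified diff of a proposed change for user approval."""
    return "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))
```

An empty string means the proposed text is identical to the original, so there is nothing to approve.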
User Options
| Action | Description |
|---|---|
| Apply | Apply the proposed fix |
| Skip | Skip this issue, move to next |
| Edit | Modify the proposed fix before applying |
| Test first | Write a test before fixing (invoke /sc:tdd) |
| Details | Show more context about the error |
| Quit | Exit fix loop |
Validation After Fix
| Fix Type | Validation |
|---|---|
| Python code | pytest -k "test_function" -v |
| JavaScript | npm test -- --testPathPattern=file |
| Config | Restart and verify health |
| Database | Run migrations |
Phase 5: Knowledge Learning
After a successful fix, offer to save the pattern:
- Extract error signature - Convert specific values to regex patterns
- Record fix details - What changed, where, validation command
- Update knowledge file - Append or update existing pattern
- Report confidence - Initial confidence 1.0, adjusted over time
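The learning steps above might be implemented along these lines. This is a minimal sketch: the helper names, the `pattern-NNN` id scheme, and generalizing only digit runs are assumptions. The field names and the initial confidence of 1.0 follow the knowledge file structure and the list above.

```python
import re
from datetime import datetime, timezone


def to_message_pattern(message: str) -> str:
    """Escape a concrete message and generalize digit runs to \\d+."""
    return re.sub(r"\d+", lambda _: r"\d+", re.escape(message))


def record_fix(knowledge: dict, signature: dict, fix: dict) -> dict:
    """Append a new pattern or bump the counter of an existing fix."""
    patterns = knowledge["error_patterns"]
    entry = next((p for p in patterns if p["signature"] == signature), None)
    if entry is None:
        entry = {"id": f"pattern-{len(patterns) + 1:03d}",
                 "signature": signature, "fixes": []}
        patterns.append(entry)
    known = next((f for f in entry["fixes"]
                  if f["description"] == fix["description"]), None)
    if known is None:
        # New fixes start at confidence 1.0, to be adjusted over time.
        entry["fixes"].append({**fix, "success_count": 1, "confidence": 1.0})
    else:
        known["success_count"] += 1
    knowledge["metadata"]["total_fixes"] += 1
    knowledge["metadata"]["updated_at"] = datetime.now(timezone.utc).isoformat()
    return knowledge
```

The updated dict would then be serialized with `json.dump` to `.claude/knowledge/log-fix-knowledge.json`.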
Learning Prompt
Fix applied and validated!
Save this pattern for future similar errors?
[Y] Yes, save pattern and fix
[n] No, this was a one-off fix
[e] Yes, but let me edit the description first
Phase 6: Session Report
## Log-Fix Session Summary
**Logs Analyzed**: [sources] | **Time Range**: [range]
**Errors Found**: N | **Unique Issues**: M
### Issues Addressed
| # | Component | Issue | Status | Action |
|---|-----------|-------|--------|--------|
| 1 | app.services | Connection timeout | FIXED | Added retry config |
| 2 | app.models | Validation error | SKIPPED | User deferred |
### Files Modified
- src/services/api.py
- src/config.py
### Patterns Learned
- 1 new pattern saved to knowledge base
### Next Steps
- Run `/sc:pr-check` to validate all changes
- Consider `/sc:tdd` for issues without test coverage
MCP Integration
PAL MCP (Debugging & Analysis)
| Tool | When to Use | Purpose |
|---|---|---|
| `mcp__pal__debug` | Complex bugs | Multi-stage root cause analysis |
| `mcp__pal__thinkdeep` | Unclear patterns | Deep investigation of recurring issues |
| `mcp__pal__codereview` | Fix validation | Review proposed fix quality |
| `mcp__pal__apilookup` | Dependency errors | Get current docs for version issues |
PAL Usage Patterns
# Debug complex error pattern
mcp__pal__debug(
step="Investigating recurring connection timeout in API layer",
hypothesis="Connection pool exhaustion under load",
confidence="medium",
relevant_files=["src/services/api.py", "src/config.py"]
)
# Deep analysis of unclear pattern
mcp__pal__thinkdeep(
step="Why do these errors only occur during peak hours?",
hypothesis="Race condition in connection pooling",
confidence="low"
)
Rube MCP (Notifications)
| Tool | When to Use | Purpose |
|---|---|---|
| `mcp__rube__RUBE_SEARCH_TOOLS` | External logging | Find logging service tools |
| `mcp__rube__RUBE_MULTI_EXECUTE_TOOL` | Notifications | Post fix summary to Slack/Jira |
Tool Coordination
- Bash - Log parsing (jq, grep), test execution, docker commands
- Glob - Log file discovery
- Grep - Error pattern search, request ID tracing
- Read - Source code inspection, log file reading
- Edit - Apply code fixes
- Write - Knowledge base updates, session reports
Related Skills
- `/sc:tdd` - Write tests before fixing (TDD approach)
- `/sc:pr-check` - Validate all changes before PR
- `/sc:analyze` - Deeper code analysis when root cause is unclear
Repository: tony363/superclaude