log-analysis
Log Analysis Skill
Overview
This skill provides techniques for effective log parsing, analysis, and insight extraction across infrastructure platforms. It covers log formats, parsing strategies, pattern recognition, and analysis methodologies.
Log Analysis Fundamentals
Log Anatomy
Every log entry typically contains:
[TIMESTAMP] [LEVEL] [SOURCE] [MESSAGE] [CONTEXT]
Key Fields:
- Timestamp: When the event occurred (critical for correlation)
- Level: Severity (DEBUG, INFO, WARN, ERROR, FATAL)
- Source: Component that generated the log
- Message: Human-readable description
- Context: Additional metadata (request ID, user ID, etc.)
Log Levels
| Level | Use | Action Required |
|---|---|---|
| FATAL | System cannot continue | Immediate |
| ERROR | Operation failed | Investigate |
| WARN | Potential problem | Monitor |
| INFO | Normal operation | None (audit) |
| DEBUG | Diagnostic detail | None (troubleshoot) |
Parsing Strategies
Structured Logs (JSON)
Most modern systems emit JSON logs:
{
"timestamp": "2024-01-15T14:30:00.123Z",
"level": "error",
"message": "Database connection failed",
"service": "api",
"request_id": "req-abc123",
"error": {
"code": "CONN_TIMEOUT",
"detail": "Connection timed out after 30000ms"
}
}
Parsing approach:
- Parse JSON structure
- Extract standard fields
- Flatten nested objects for analysis
- Group by common attributes
Unstructured Logs (Plain Text)
Legacy systems often use plain text:
2024-01-15 14:30:00 ERROR [api.handler] Database connection failed: timeout after 30s
Parsing approach:
- Identify timestamp format with regex
- Extract level using keyword matching
- Parse source from brackets/prefixes
- Remainder is message
Mixed Format Logs
Some systems mix formats:
[14:30:00] INFO: Starting request processing {"request_id": "abc123"}
Parsing approach:
- Split structured from unstructured portions
- Parse each portion with appropriate strategy
- Merge results
Analysis Techniques
Time-Based Analysis
Windowing: Group events by time period
Window: 1 minute
14:30 - 14:31: 5 errors
14:31 - 14:32: 12 errors ← Spike detected
14:32 - 14:33: 3 errors
Correlation: Match events across systems by timestamp
14:30:01.123 [API] Request received
14:30:01.125 [Auth] Token validated
14:30:01.130 [Database] Query started
14:30:01.145 [Database] Query completed
14:30:01.147 [API] Response sent
Pattern Recognition
Error Clustering: Group similar errors
Pattern: "Connection refused to {host}:{port}"
Instances:
- Connection refused to db-1:5432 (15 times)
- Connection refused to db-2:5432 (3 times)
Anomaly Detection: Identify unusual patterns
Normal: 10-20 requests/second
Current: 500 requests/second ← Anomaly
Frequency Analysis
Count by category:
| Error Type | Count | % of Total |
|---|---|---|
| Connection timeout | 45 | 60% |
| Auth failure | 20 | 27% |
| Validation error | 10 | 13% |
Trend analysis:
Hour 1: 10 errors
Hour 2: 15 errors
Hour 3: 25 errors ← Trending up
Hour 4: 50 errors ← Accelerating
Root Cause Indicators
First occurrence: Often indicates trigger
First error: 14:30:01 - "Failed to connect to new endpoint"
Subsequent: 14:30:02+ - "Connection pool exhausted"
Cascade patterns: Later errors caused by earlier ones
14:30:01 [DB] Connection failed
14:30:02 [API] Database unavailable
14:30:02 [API] Database unavailable
14:30:03 [API] Database unavailable
↑ Cascade from initial DB failure
Log Retrieval Commands
Supabase
# Available services
api, postgres, auth, storage, realtime, edge-function
# MCP command
mcp__plugin_supabase_supabase__get_logs(project_id, service)
GitHub Actions
# List runs
gh run list --limit 20
# Get logs
gh run view <run-id> --log
gh run view <run-id> --log-failed
Railway
# Recent logs
railway logs
# Follow live
railway logs --follow
Output Formatting
Summary Format
## Log Analysis Summary
**Time Range**: {START} to {END}
**Total Entries**: {COUNT}
**Error Rate**: {PCT}%
### By Level
| Level | Count | % |
|-------|-------|---|
| ERROR | 50 | 5% |
| WARN | 100 | 10% |
| INFO | 850 | 85% |
### Top Errors
1. {Error 1} - {count} occurrences
2. {Error 2} - {count} occurrences
### Timeline
{Key events in chronological order}
### Recommendations
{Based on patterns found}
Detailed Format
For specific error investigation:
## Error Details: {ERROR_TYPE}
**First Seen**: {TIMESTAMP}
**Last Seen**: {TIMESTAMP}
**Occurrences**: {COUNT}
### Sample Entry
{Full log entry}
### Context
{Surrounding log entries}
### Pattern
{What triggers this error}
### Impact
{What this error affects}
See patterns.md for platform-specific log patterns.
More from doubleslashse/claude-marketplace
requirements-clarification
Requirements clarification for TDD. Use BEFORE RED phase to understand WHAT to test. Asks targeted questions to uncover ambiguities, edge cases, and acceptance criteria.
26srs-documentation
Software Requirements Specification documentation following IEEE 830 standard. Use when generating formal SRS documents or compiling gathered requirements into structured documentation.
17brainstorming
Brainstorming techniques for idea generation. Use when facilitating brainstorming sessions, leading ideation exercises, or helping teams generate creative solutions.
14clean-code
Clean code principles including DRY, KISS, and YAGNI for .NET. Use when writing or reviewing code to ensure maintainability and simplicity.
11design-thinking
Design thinking methodology for human-centered problem solving. Use when facilitating design thinking workshops, user research sessions, or creative problem-solving activities.
10codebase-analysis
Techniques for analyzing existing codebases to reverse-engineer requirements and understand business logic. Use when conducting brownfield analysis or understanding existing system capabilities.
10