ln-514-test-log-analyzer
Paths: File paths (
shared/,references/,../ln-*) are relative to skills repo root. If not found at CWD, locate this SKILL.md directory and go up one level for repo root.
Test Log Analyzer
Two-layer analysis of application logs. Python script handles collection and quantitative analysis; AI handles classification, quality assessment, and fix recommendations.
Inputs
No required inputs. Runs in current project directory, auto-detects log sources.
Optional args — caller instructions (natural language): time window, expected errors, test context. Example: "review logs for last 30min, auth 401 errors expected from negative tests".
Purpose & Scope
- Analyze application logs (after test runs, during development, or on demand)
- Classify errors into 4 categories: Real Bug, Test Artifact, Expected Behavior, Operational Warning
- Assess log quality: noisiness, completeness, level correctness, format, structured logging
- Map stack traces to source files; provide fix recommendations
- Report findings for quality verdict (only Real Bugs block)
- No status changes or task creation — report only
When to Use
- Analyze application logs in any project (default: last 1h)
- After test runs to classify errors and assess log quality
- Can be invoked with context instructions:
Skill(skill: "ln-514-test-log-analyzer", args: "review last 30min, 401 errors expected")
Workflow
Phase 0: Parse Instructions
If args provided — extract: time window (default: 1h), expected errors list, test context.
If no args — use defaults (last 1h, no expected errors).
Phase 1: Log Source Detection and Script Execution
MANDATORY READ: Load docs/project/infrastructure.md, docs/project/runbook.md
- Check if
scripts/analyze_test_logs.pyexists in target project. If missing, copy fromreferences/analyze_test_logs.py. - Detect log source mode (auto-detection priority: docker → file → loki):
| Mode | Detection | Source |
|---|---|---|
docker |
docker compose ps returns running containers |
docker compose logs --since {window} |
file |
.log files exist, or tests/manual/results/ has output |
File paths from infrastructure.md or *.log glob |
loki |
LOKI_URL env var or tools_config.md observability section |
Loki HTTP query_range API |
- Run script:
python scripts/analyze_test_logs.py --mode {detected} [options] - If no log sources found → return
NO_LOG_SOURCESstatus, skip to Phase 5.
Phase 2: 4-Category Error Classification
Classify each error group from script JSON output:
| Category | Action | Criteria |
|---|---|---|
| Real Bug | Fix | Unexpected crash, data loss, broken pipeline |
| Test Artifact | Skip | From test scripts, deliberate error-path validation |
| Expected Behavior | Skip | Rate limiting, input validation, auth failures from invalid tokens |
| Operational Warning | Monitor | Clock drift, resource pressure, temporary unavailability |
Test artifact detection heuristics:
- Test name contains:
invalid,error,fail,reject,unauthorized,forbidden,not_found,bad_request,timeout - Test asserts non-2xx status codes (4xx, 5xx)
- Test uses
pytest.raises,expect(...).rejects,assertThrows,should.throw - Errors correlate with test execution timestamps from regression test output
- Patterns matching
tests/manual/scripts
Error taxonomy per references/error_taxonomy.md (9 categories: CRASH, TIMEOUT, AUTH, DB, NETWORK, VALIDATION, CONFIG, RESOURCE, DEPRECATION).
Phase 3: Log Quality Assessment
MANDATORY READ: Load references/error_taxonomy.md (per-level criteria table + level correctness reference)
Step 1: Detect configured log level. Check in order:
LOG_LEVEL/LOGLEVELenv var (.env,docker-compose.yml,infrastructure.md)- Framework config: Python
logging.conf/ DjangoLOGGING/ NodeLOG_LEVEL - Default: assume
INFOif not detected
Configured level determines WHICH levels appear in logs, but each level has its own noise threshold regardless.
Step 2: Assess 6 quality dimensions:
| Dimension | What to Check | Signal |
|---|---|---|
| Noisiness | Per-level noise thresholds from error_taxonomy.md section 4: TRACE (zero in prod), DEBUG (>50% monopoly), INFO (>30%), WARNING (>1% of total), ERROR (>0.1% of total) |
NOISY: {level} template "{msg}" at {ratio}% |
| Completeness & Traceability | Critical operations missing log entries + traceability gaps (see table below) | MISSING: No log for {operation} / TRACEABILITY_GAP: {type} in {file}:{line} |
| Level correctness | Per-level criteria from error_taxonomy.md section 4: content, anti-patterns, library rule |
WRONG_LEVEL: should be {level} |
| Structured logging | Missing trace_id/request_id/user context; unstructured plaintext | UNSTRUCTURED: lacks {field} |
| Sensitivity | PII/secrets/tokens/passwords in log messages | SENSITIVE: {type} exposure |
| Context richness | Errors without actionable context (order_id, user_id, operation) | LOW_CONTEXT: lacks context |
Traceability gap detection — scan source code for operations without INFO-level logging:
| Operation Type | Expected Log | Where to Add |
|---|---|---|
| Incoming request handling | Request received + response status | Entry/exit of route handler |
| External API call | Request sent + response status + duration | Before/after HTTP client call |
| DB write (INSERT/UPDATE/DELETE) | Operation + affected entity + count | Before/after ORM/query call |
| Auth decision | Result (allow/deny) + reason | After auth check |
| State transition | Old state → new state + trigger | At transition point |
| Background job | Start + complete/fail + duration | Entry/exit of job handler |
| File/resource operation | Open/close + path + size | At I/O operation |
Log Format Quality (10-criterion checklist per references/log_analysis_output_format.md):
| # | Criterion | Check |
|---|---|---|
| 1 | Dual format | JSON in prod, readable in dev |
| 2 | Timestamp | Consistent, timezone-aware |
| 3 | Level field | Present, uppercase |
| 4 | Trace/Correlation ID | Present in every entry, async-safe |
| 5 | Service name | Identifies source service |
| 6 | Source location | module:line + function |
| 7 | Extra context | Structured fields, not string interpolation |
| 8 | PII redaction | Passwords, API keys, emails handled |
| 9 | Noise suppression | Duplicate filters, third-party suppressed |
| 10 | Parseability | Dev: pipe-delimited; prod: valid JSON per line |
Score: passed criteria / 10.
Phase 4: Stack Trace Mapping + Fix Recommendations
For each Real Bug:
- Extract stack trace frames; identify origin frame (first frame in project code, not in node_modules/site-packages)
- Map to source file:line
- Generate fix recommendation: what to change, where, effort estimate (S/M/L)
Prioritize using Sentry-inspired dimensions:
- High-volume (occurrence count), Post-test regression (new errors), High-impact path (auth/payment/DB), Correlated traces (trace_id across services)
Phase 5: Generate Report
MANDATORY READ: Load references/log_analysis_output_format.md
Output report to chat with header ## Test Log Analysis. Include:
- Signals table (Real Bugs count, Test Artifacts filtered, Log Noise status, Log Format score, Log Quality score)
- Real Bugs table (priority, category, error, source, fix recommendation)
- Filtered table (category, count, examples)
- Log Quality Issues table (dimension, service, issue, recommendation)
- Noise Report table (count, ratio, service, level, template, action)
- Machine-readable block
<!-- LOG-ANALYSIS-DATA ... -->for programmatic consumption
Phase 6: Meta-Analysis
MANDATORY READ: Load shared/references/meta_analysis_protocol.md
Skill type: execution-worker. Run after all phases complete.
Verdict Contribution
Quality coordinator normalization matrix component:
| Status | Maps To | Penalty |
|---|---|---|
| CLEAN | -- | 0 |
| WARNINGS_ONLY | -- | 0 |
| REAL_BUGS_FOUND | FAIL | -20 |
| SKIPPED / NO_LOG_SOURCES | ignored | 0 |
Log quality/format issues are INFORMATIONAL — do not affect quality verdict. Only Real Bugs block.
Critical Rules
- No status changes or task creation; report only.
- Test Artifacts and Expected Behavior are ALWAYS filtered — never count as bugs.
- Log quality issues are advisory — inform, don't block.
- Script must handle gracefully: no Docker, no log files, no Loki →
NO_LOG_SOURCES. - Language preservation in comments (EN/RU).
Definition of Done
- Script deployed to target project
scripts/(or already exists). - Log source detected and script executed (or NO_LOG_SOURCES returned).
- Errors classified into 4 categories; Real Bugs identified.
- Log quality assessed (6 dimensions + 10-criterion format checklist).
- Stack traces mapped to source files for Real Bugs.
- Report output to chat with signals table + machine-readable block.
Reference Files
- Error taxonomy:
references/error_taxonomy.md - Output format:
references/log_analysis_output_format.md - Analysis script:
references/analyze_test_logs.py
Version: 1.0.0 Last Updated: 2026-03-13