dt-obs-logs
Log Analysis Skill
Query, filter, and analyze Dynatrace log data using DQL for troubleshooting and monitoring.
What This Skill Covers
- Fetching and filtering logs by severity, content, and entity
- Searching log messages using pattern matching
- Calculating error rates and statistics
- Analyzing log patterns and trends
- Grouping and aggregating log data by dimensions
When to Use This Skill
Use this skill when users want to:
- Find specific log entries (e.g., "show me error logs from the last hour")
- Filter logs by severity, process group, or content
- Search logs for specific keywords or phrases
- Calculate error rates or log statistics
- Identify common error messages or patterns
- Analyze log trends over time
- Troubleshoot issues using log data
Key Concepts
Log Data Model
- `timestamp`: When the log entry was created
- `content`: The log message text
- `status`: Log level (ERROR, FATAL, WARN, INFO, etc.)
- `dt.process_group.id`: Associated process group entity
- `dt.process_group.detected_name`: Resolves process group IDs to human-readable names
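A quick way to inspect these fields side by side (a minimal sketch; the 30-minute window is arbitrary):

```
fetch logs, from:now() - 30m
| fields timestamp, content, status, dt.process_group.id, dt.process_group.detected_name
| limit 10
```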
Query Patterns
- `fetch logs`: Primary command for log data access
- Time ranges: Use `from:now() - <duration>` for time windows
- Filtering: Apply severity, content, and entity filters
- Aggregation: Group and summarize log data
- Pattern Detection: Use `matchesPhrase()` and `contains()` for content search
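The patterns above compose into a single pipeline; a sketch (the window, severity, and "timeout" phrase are illustrative):

```
fetch logs, from:now() - 30m                               // time range
| filter status == "ERROR"                                 // filtering
| filter matchesPhrase(content, "timeout")                 // pattern detection
| summarize count(), by: {dt.process_group.detected_name}  // aggregation
```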
Common Operations
- Severity filtering (single or multiple levels)
- Content search (simple and full-text)
- Entity-based filtering (process groups)
- Time-series analysis (bucketing, sorting)
- Error rate calculation
- Pattern analysis (exceptions, timeouts, etc.)
Core Workflows
1. Log Searching
Find specific log entries by time, severity, and content.
Typical steps:
- Define time range
- Filter by severity (optional)
- Search content for keywords
- Select relevant fields
- Sort and limit results
Example:
```
fetch logs, from:now() - 1h
| filter status == "ERROR"
| fields timestamp, content, process_group = dt.process_group.detected_name
| sort timestamp desc
| limit 100
```
2. Log Filtering
Narrow down logs using multiple criteria (severity, entity, content).
Typical steps:
- Fetch logs with time range
- Apply severity filters
- Filter by entity (process_group)
- Apply content filters
- Format and sort output
Example:
```
fetch logs, from:now() - 2h
| filter in(status, {"ERROR", "FATAL", "WARN"})
| summarize count(), by: {dt.process_group.id, dt.process_group.detected_name}
| fieldsAdd process_group = dt.process_group.detected_name
| sort `count()` desc
```
3. Pattern Analysis
Identify patterns, trends, and anomalies in log data.
Typical steps:
- Fetch logs with time range
- Add pattern detection fields
- Aggregate by entity or time
- Calculate statistics and ratios
- Sort by frequency or rate
Example:
```
fetch logs, from:now() - 2h
| filter status == "ERROR"
| fieldsAdd
    has_exception = if(matchesPhrase(content, "exception"), true, else: false),
    has_timeout = if(matchesPhrase(content, "timeout"), true, else: false)
| summarize
    count(),
    exception_count = countIf(has_exception == true),
    timeout_count = countIf(has_timeout == true),
    by: {process_group = dt.process_group.detected_name}
```
Key Functions
Filtering
- `filter status == "ERROR"` - Filter by status level
- `in(status, {"ERROR", "FATAL", "WARN"})` - Multi-status filter
- `contains(content, "keyword")` - Simple substring search
- `matchesPhrase(content, "exact phrase")` - Full-text phrase search
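These filters chain together; a sketch (the keyword is a placeholder):

```
fetch logs, from:now() - 1h
| filter in(status, {"ERROR", "FATAL"})
| filter contains(content, "database")
| fields timestamp, status, content
| limit 50
```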
Entity Operations
- `dt.process_group.detected_name` - Get human-readable process group name
- `filter process_group == "service-name"` - Filter by specific entity
Aggregation
- `count()` - Count all log entries
- `countIf(condition)` - Conditional count
- `by: {dimension}` - Group by entity or time bucket
- `bin(timestamp, 5m)` - Time bucketing for trends
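Combined, these produce a per-interval breakdown (the 6-hour window and 15-minute bucket size are arbitrary choices):

```
fetch logs, from:now() - 6h
| summarize total = count(), errors = countIf(status == "ERROR"), by: {bucket = bin(timestamp, 15m)}
| sort bucket asc
```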
Field Operations
- `fields timestamp, content, status` - Select specific fields
- `fieldsAdd name = expression` - Add computed fields
- `if(condition, true_value, else: false_value)` - Conditional logic
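A sketch combining the three (the `is_error` field name is illustrative):

```
fetch logs, from:now() - 1h
| fields timestamp, content, status
| fieldsAdd is_error = if(status == "ERROR", true, else: false)
| limit 50
```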
Common Patterns
Content Search
Simple substring search:
```
fetch logs, from:now() - 1h
| filter contains(content, "database")
| fields timestamp, content, status
```
Full-text phrase search:
```
fetch logs, from:now() - 1h
| filter matchesPhrase(content, "connection timeout")
| fields timestamp, content, process_group = dt.process_group.detected_name
```
Error Rate Calculation
Calculate error rates over time:
```
fetch logs, from:now() - 2h
| summarize
    total_logs = count(),
    error_logs = countIf(status == "ERROR"),
    by: {time_bucket = bin(timestamp, 5m)}
| fieldsAdd error_rate = (error_logs * 100.0) / total_logs
| sort time_bucket asc
```
Top Error Messages
Find most common errors:
```
fetch logs, from:now() - 24h
| filter status == "ERROR"
| summarize error_count = count(), by: {content}
| sort error_count desc
| limit 20
```
Process Group-Specific Logs
Filter logs by process group:
```
fetch logs, from:now() - 1h
| fieldsAdd process_group = dt.process_group.detected_name
| filter process_group == "payment-service"
| filter status == "ERROR"
| fields timestamp, content, status
| sort timestamp desc
```
Structured / JSON Log Parsing
Many applications emit JSON-formatted log lines. Use `parse` to extract fields instead of dumping raw content:
```
fetch logs, from:now() - 1h
| filter status == "ERROR"
| parse content, "JSON:log"
| fieldsAdd level = log[level], message = log[msg], error = log[error]
| fields timestamp, level, message, error
| sort timestamp desc
| limit 50
```
Aggregate by a parsed field:
```
fetch logs, from:now() - 4h
| filter status == "ERROR"
| parse content, "JSON:log"
| fieldsAdd message = log[msg]
| summarize error_count = count(), by: {message}
| sort error_count desc
| limit 20
```
Notes:
- `parse content, "JSON:log"` creates a record field `log`; access nested values with `log[key]`
- Filter logs with `contains()` before `parse` to reduce parsing overhead
- Works with any JSON-structured field, not just `content`
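Applying the pre-filter note above, a sketch (assumes the relevant JSON lines contain the substring "error" before parsing):

```
fetch logs, from:now() - 1h
| filter contains(content, "error")   // cheap substring pre-filter before parse
| parse content, "JSON:log"
| fieldsAdd message = log[msg]
| fields timestamp, message
| limit 50
```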
Best Practices
- Always specify time ranges - Use `from:now() - <duration>` to limit data
- Apply filters early - Filter by severity and entity before aggregation
- Use appropriate search methods - `contains()` for simple, `matchesPhrase()` for exact
- Limit results - Add `| limit 100` to prevent overwhelming output
- Sort meaningfully - Sort by timestamp for recent logs, by count for top errors
- Name entities - Use `dt.process_group.detected_name` or `getNodeName()` for human-readable output
- Use time buckets for trends - `bin(timestamp, 5m)` for time-series analysis
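The practices above combined in one query (a sketch; the window and bucket size are arbitrary):

```
fetch logs, from:now() - 2h
| filter status == "ERROR"
| summarize errors = count(), by: {bucket = bin(timestamp, 5m), process_group = dt.process_group.detected_name}
| sort errors desc
| limit 100
```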
Integration Points
- Entity model: Uses `dt.process_group.id` for service correlation
- Time series: Supports temporal analysis with `bin()` and time ranges
- Content search: Full-text search capabilities via `matchesPhrase()`
- Aggregation: Statistical analysis using `summarize` and conditional functions
Limitations & Notes
- Log availability depends on OneAgent configuration and log ingestion
- Full-text search (`matchesPhrase`) may have performance implications on large datasets
- Entity names require proper OneAgent monitoring for resolution
- Time ranges should be reasonable (avoid unbounded queries)
Related Skills
- dt-dql-essentials - Core DQL syntax and query structure for log queries
- dt-obs-tracing - Correlate logs with distributed traces using trace IDs
- dt-obs-problems - Correlate logs with DAVIS-detected problems