Grafana
Dashboard JSON Gotchas
Required Fields
{
"dashboard": {
"id": null, // null for new, existing ID for update
"uid": "unique-id", // Stable identifier for API/links
"title": "Name",
"panels": []
},
"overwrite": true // Required at root level for updates
}
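The `//` annotations in the snippet above are explanatory only; JSON itself has no comment syntax, so they must not appear in a real request body. A minimal Python sketch of building the same envelope follows. The helper name `build_dashboard_payload` is hypothetical, and the result is assumed to be sent as the body of a dashboard create/update request.

```python
import json


def build_dashboard_payload(title, uid, panels=None, dashboard_id=None, overwrite=True):
    """Wrap a dashboard model in the envelope shown above (hypothetical helper)."""
    return {
        "dashboard": {
            "id": dashboard_id,  # None serializes to JSON null, i.e. a new dashboard
            "uid": uid,          # stable identifier for API calls and links
            "title": title,
            "panels": list(panels or []),
        },
        "overwrite": overwrite,  # sits at the root, not inside "dashboard"
    }


payload = build_dashboard_payload("Name", "unique-id")
body = json.dumps(payload)  # comment-free JSON, ready to use as a request body
```

Keeping `overwrite` outside the `dashboard` object is the detail that is easiest to get wrong when assembling the payload by hand.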
Panel Positioning
gridPos uses a 24-column grid: x is the column offset, w is the width in columns, and x + w must not exceed 24. Panels auto-stack if positions overlap.
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
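Computing positions instead of typing them by hand avoids accidental overlaps. The sketch below is one possible layout routine, not something Grafana provides: `layout_panels` is a hypothetical helper that fills rows left to right and wraps when the 24 columns are used up.

```python
GRID_COLUMNS = 24  # width of the dashboard grid


def layout_panels(sizes):
    """Place (w, h) panels left to right, wrapping to a new row when full."""
    positions = []
    x = y = row_height = 0
    for w, h in sizes:
        if not 1 <= w <= GRID_COLUMNS:
            raise ValueError(f"width {w} is outside 1..{GRID_COLUMNS}")
        if x + w > GRID_COLUMNS:  # no room left on this row: start the next one
            x, y, row_height = 0, y + row_height, 0
        positions.append({"h": h, "w": w, "x": x, "y": y})
        x += w
        row_height = max(row_height, h)
    return positions


# Two half-width panels share the first row; the full-width panel wraps below.
grid = layout_panels([(12, 8), (12, 8), (24, 6)])
```

Each returned dict can be assigned directly to a panel's `gridPos`.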
Data Source Reference
Always use UID, not name (names can change):
"datasource": {"type": "prometheus", "uid": "prometheus-uid"}
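A quick way to audit an existing dashboard for name-based references is to walk its panels and query targets. This is a minimal sketch; `find_name_based_datasources` is a hypothetical helper, and it assumes datasources appear on panels and on each entry of a panel's `targets` list.

```python
def find_name_based_datasources(dashboard):
    """Return (panel title, target refId or None) for each non-UID datasource."""
    offenders = []
    for panel in dashboard.get("panels", []):
        candidates = [(None, panel.get("datasource"))]
        candidates += [
            (target.get("refId"), target.get("datasource"))
            for target in panel.get("targets", [])
        ]
        for ref_id, ds in candidates:
            if ds is None:
                continue  # nothing set here; the panel or default datasource applies
            if not (isinstance(ds, dict) and ds.get("uid")):
                offenders.append((panel.get("title"), ref_id))
    return offenders


dashboard = {
    "panels": [
        {
            "title": "CPU",
            "datasource": {"type": "prometheus", "uid": "prometheus-uid"},
            "targets": [{"refId": "A", "datasource": "Prometheus"}],
        },
        {"title": "Memory", "datasource": "Prometheus"},
    ]
}
offenders = find_name_based_datasources(dashboard)
```

A `None` in the second position means the panel itself, rather than one of its targets, carries the name-based reference.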
Template Variable Syntax
| Context | Syntax | Multi-value behavior |
|---|---|---|
| PromQL label | {ns=~"$var"} | Pipe-joined: ns=~"a\|b\|c" |
| SQL | '$var' or ${var:csv} | Depends on format |
| Lucene | $var | OR-joined: ("a" OR "b" OR "c") |
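To see what a multi-value selection turns into before it reaches the data source, the expansion can be emulated in plain Python. This is an illustration of the `pipe`, `csv`, and `singlequote` format options only, not Grafana's own interpolation code; `format_multi_value` is a hypothetical name.

```python
def format_multi_value(values, fmt):
    """Emulate how a multi-value variable expands for a given format option."""
    if fmt == "pipe":
        return "|".join(values)
    if fmt == "csv":
        return ",".join(values)
    if fmt == "singlequote":
        # Quote each value and backslash-escape embedded single quotes.
        return ",".join("'" + v.replace("'", "\\'") + "'" for v in values)
    raise ValueError(f"unsupported format: {fmt}")


namespaces = ["a", "b", "c"]
promql_matcher = '{ns=~"%s"}' % format_multi_value(namespaces, "pipe")
sql_filter = "WHERE ns IN (%s)" % format_multi_value(namespaces, "singlequote")
# promql_matcher -> {ns=~"a|b|c"}
# sql_filter     -> WHERE ns IN ('a','b','c')
```

The PromQL case only works with the regex matcher `=~`; an equality matcher `=` would compare against the literal string `a|b|c`.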
Variable Query Refresh
- 0: Never (values are cached, not re-queried)
- 1: On dashboard load
- 2: On time range change (use this for most cases)
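Naming the refresh codes keeps generated dashboards readable. The sketch below assumes a query-type variable stored under `templating.list` in the dashboard model; `VariableRefresh` and `query_variable` are hypothetical names, and the `label_values(...)` query is just an example for a Prometheus data source.

```python
from enum import IntEnum


class VariableRefresh(IntEnum):
    """Named versions of the numeric refresh codes listed above."""
    NEVER = 0
    ON_DASHBOARD_LOAD = 1
    ON_TIME_RANGE_CHANGE = 2


def query_variable(name, query, refresh=VariableRefresh.ON_TIME_RANGE_CHANGE):
    """Build a minimal query-variable definition (only the fields discussed here)."""
    return {"name": name, "type": "query", "query": query, "refresh": int(refresh)}


templating = {
    "list": [query_variable("ns", "label_values(kube_pod_info, namespace)")]
}
```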
Alerting (Grafana 9+)
Alert Rule Data Array
"data": [
{"refId": "A", "model": {"expr": "..."}}, // Query
{"refId": "B", "reducer": "last", "expression": "A"}, // Reduce
{"refId": "C", "type": "threshold", "expression": "B", "conditions": [...]} // Condition (must be last)
]
The condition field in the rule must reference the final refId (here "C").
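The ordering rule can be checked mechanically before a rule is provisioned. The sketch below follows the simplified node shape used in the snippet above (a top-level `expression` key naming a single earlier refId); exported rules typically nest these keys under `model`, so adapt the lookups accordingly. `check_alert_data` is a hypothetical helper.

```python
def check_alert_data(data, condition):
    """Return a list of problems found in a rule's data array and condition."""
    problems = []
    seen = []
    for node in data:
        ref_id = node["refId"]
        source = node.get("expression")
        # An expression node may only read from a refId defined before it.
        if source is not None and source not in seen:
            problems.append(f"{ref_id} references {source!r}, which is not defined earlier")
        seen.append(ref_id)
    if not seen:
        problems.append("data array is empty")
    elif condition != seen[-1]:
        problems.append(f"condition {condition!r} is not the final refId {seen[-1]!r}")
    return problems


data = [
    {"refId": "A", "model": {"expr": "up"}},
    {"refId": "B", "reducer": "last", "expression": "A"},
    {"refId": "C", "type": "threshold", "expression": "B"},
]
ok = check_alert_data(data, "C")       # no problems
wrong = check_alert_data(data, "B")    # condition points at the reduce step
```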
No Data / Error States
- NoData: Query returns empty
- Alerting: Treat no data as firing
- OK: Treat no data as resolved
- Error: Evaluation failed
Common Panel Configs
Thresholds
"thresholds": {
"mode": "absolute", // or "percentage"
"steps": [
{"color": "green", "value": null}, // null = base
{"color": "yellow", "value": 70},
{"color": "red", "value": 90}
]
}
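How a value picks its color from `steps` can be sketched in a few lines. This assumes absolute mode and steps sorted in ascending order with the `null` base step first, as in the snippet above; `threshold_color` is a hypothetical helper, not a Grafana function.

```python
def threshold_color(value, steps):
    """Return the color of the highest step whose value is <= the given value."""
    color = None
    for step in steps:
        floor = step["value"]
        if floor is None or value >= floor:  # None is the base step: always matches
            color = step["color"]
    return color


steps = [
    {"color": "green", "value": None},
    {"color": "yellow", "value": 70},
    {"color": "red", "value": 90},
]
colors = [threshold_color(v, steps) for v in (50, 70, 95)]
# colors -> ["green", "yellow", "red"]
```

A value exactly equal to a step's `value` takes that step's color, which is why 70 is already yellow.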
Value Mappings
"mappings": [
{"type": "value", "options": {"0": {"text": "Down", "color": "red"}}},
{"type": "range", "options": {"from": 1, "to": 100, "result": {"text": "OK"}}}
]
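The two mapping types differ in how they match: `value` mappings look up the value as a string key, while `range` mappings compare numerically. The sketch below illustrates that difference under the assumptions that range bounds are inclusive and that a missing bound is open-ended; `apply_mappings` is a hypothetical helper.

```python
def apply_mappings(value, mappings):
    """Return the first matching mapping result, or None if nothing matches."""
    for mapping in mappings:
        options = mapping["options"]
        if mapping["type"] == "value":
            result = options.get(str(value))  # keys are strings, e.g. "0"
            if result is not None:
                return result
        elif mapping["type"] == "range":
            low, high = options.get("from"), options.get("to")
            if (low is None or value >= low) and (high is None or value <= high):
                return options["result"]
    return None


mappings = [
    {"type": "value", "options": {"0": {"text": "Down", "color": "red"}}},
    {"type": "range", "options": {"from": 1, "to": 100, "result": {"text": "OK"}}},
]
down = apply_mappings(0, mappings)
ok = apply_mappings(50, mappings)
unmapped = apply_mappings(500, mappings)
```

Because value keys are strings, a float such as `0.0` would not match the `"0"` key in this sketch.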
Units
- percentunit: 0-1 displayed as 0%-100%
- percent: Already 0-100, displayed as-is
- bytes vs decbytes: Binary (1024) vs decimal (1000)
- s, ms, µs, ns: Auto-scales appropriately
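Picking the wrong unit is easy to miss because the panel still renders a plausible number. The sketch below approximates the arithmetic behind the percent and byte units; the function names are hypothetical and the exact suffixes and rounding in Grafana may differ.

```python
def as_percentunit(value):
    """percentunit: the input is a 0-1 ratio, so multiply by 100 for display."""
    return f"{value * 100:g}%"


def as_percent(value):
    """percent: the input is already on a 0-100 scale."""
    return f"{value:g}%"


def scale_bytes(value, binary=True):
    """bytes scales by 1024 (binary=True); decbytes scales by 1000."""
    base = 1024 if binary else 1000
    suffixes = ["B", "KiB", "MiB", "GiB", "TiB"] if binary else ["B", "kB", "MB", "GB", "TB"]
    index = 0
    while value >= base and index < len(suffixes) - 1:
        value /= base
        index += 1
    return f"{value:.3g} {suffixes[index]}"


ratio_shown = as_percentunit(0.85)             # "85%"
mistaken = as_percentunit(85)                  # "8500%": percent data with the wrong unit
binary_size = scale_bytes(1536)                # "1.5 KiB"
decimal_size = scale_bytes(1536, binary=False) # "1.54 kB"
```

The same raw value of 1536 reads differently under the two byte units, which matters when comparing panels against figures quoted by disk or network vendors.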