# cx-create-dashboard: Create Coralogix Dashboard
Produces a Coralogix dashboard for a target service and deploys it via the cx CLI. Workflow: discover the service's telemetry, align on intent with the user, draft a plan, emit the JSON, live-verify every query through cx, then create the dashboard in a chosen folder.
Only use metric names, log fields, and span attributes you can cite from the service's code, README, configuration, or a live query that returned a result. Do not invent them.
## Reference files
Load these files for domain-specific guidance:
| Task | Reference |
|---|---|
| DataPrime query syntax | references/dataprime-reference.md |
| PromQL query syntax, counters vs gauges, histograms | references/promql-guidelines.md |
| Log field discovery, query patterns, wildfind policy | references/logs-querying.md |
| Span field discovery, latency analysis, trace queries | references/spans-querying.md |
| Dashboard-specific query gotchas (`${__range}`, `promqlQueryType`) | references/query-syntax.md |
| Widget JSON templates | references/widget-templates.md |
For choosing the right signal (metrics / logs / traces), use `cx-telemetry-querying`.
## Dashboard Management
Beyond creating dashboards, use these commands to manage existing ones:
| Command | Purpose |
|---|---|
| `cx dashboards catalog -o json` | List all dashboards in the catalog |
| `cx dashboards get <id> -o json` | Get a dashboard definition (useful as a template) |
| `cx dashboards folders list -o json` | List dashboard folders |
| `cx dashboards folders create --name "Name"` | Create a dashboard folder |
| `cx dashboards folders create --name "Sub" --parent-id <id>` | Create a nested folder |
To duplicate or modify an existing dashboard:

```shell
cx dashboards get <dashboard-id> -o json > dashboard.json
# Edit dashboard.json (change name, modify widgets, etc.)
cx dashboards create --from-file dashboard.json
```
## Workflow
Track progress through this checklist:
Dashboard Progress:
- [ ] Phase 1: Discover telemetry & business meaning
- [ ] Phase 2: Gather dashboard specifications from user
- [ ] Phase 3: Draft internal dashboard plan (sections/rows/widgets)
- [ ] Phase 4: Generate the Coralogix JSON
- [ ] Phase 5: Live-verify every query through the cx CLI
- [ ] Phase 6: Self-verify structure against the checklist
- [ ] Phase 7: Deploy via `cx dashboards create`
Proceed in order. Don't jump to Phase 4 before the user approves the Phase 3 plan, and don't run Phase 7 before Phases 5 and 6 both pass.
### Phase 1: Discover telemetry & business meaning
For the target service, gather:
- **Business purpose** - read `README.md` and the top-level entrypoint (`main.*`, `index.*`, `cmd/main.go`, etc.). Summarize in 2–3 sentences what it does, its key stages, and what can go wrong.
- **Metrics** - for each candidate keyword (service name, subsystem, verbs like `request`, `error`, `latency`, `dlq`) run `cx metrics search --name '*<keyword>*'`. When a metric looks promising, list its labels with `cx metrics get-labels <metric>`. Only use names `cx metrics search` returns - this is what prevents invented metrics from reaching Phase 5. Cross-check the service's instrumentation (`prometheus_client`, `promauto.NewCounter/Histogram/Gauge`, OTel meters, `prom-client`, Micrometer, `metrics.py`) for semantics and histogram buckets (`_sum`, `_count`, `_bucket`).
- **Logs** - discover custom `$d.*` fields with `cx search-fields "<description>" --dataset logs` before assuming a field exists. Sample message templates and severity with `cx logs "filter \$l.applicationname == '<app>'" --limit 5 -o json`. Standard fields (`$m.severity`, `$m.timestamp`, `$l.applicationname`, `$l.subsystemname`) don't need discovery.
- **Spans / traces** - discover span attributes with `cx search-fields "<description>" --dataset spans`. Sample with `cx spans "filter \$l.serviceName == '<svc>'" --limit 5 -o json`. Error conventions vary (`$d.tags.error`, `$d.http.status_code`); check samples before filtering.
- **Message buses & DLQs** - grep for Kafka, RabbitMQ, SQS, Pub/Sub clients and any `dlq`/`DLQ` references. Note topic/queue names for DLQ panels.
- **Service configuration** - check `meta.yaml`, Helm `values.yaml`, `Deployment`, `Dockerfile`, `chart.yaml`. Extract:
  - The `applicationname`/`subsystemname` label values as they appear in Coralogix.
  - Tenant/account/team identifiers used as metric or log labels.
  - Deployment environments (`prod`, `staging`, `dev`, …).
  - The
If the signal for a question is ambiguous (e.g. "how much revenue last week"), delegate to cx-telemetry-querying first.
Produce a short internal summary before moving on. If critical telemetry is missing (e.g. no metrics), surface that to the user and ask whether they want a log-only or trace-only dashboard.
### Phase 2: Gather dashboard specifications
Ask the user a focused set (≤6). Prefer AskQuestion:

- **Audience & use** - on-call triage, product/business tracking, capacity planning, customer success?
- **Default time range** - typical viewing window (e.g. 24h, 7d). Queries still use `${__range}` so users can zoom.
- **Slicing dimensions** - top-level filters (`tenant_id`, `account_id`, `subsystem_name`, `region`, `env`, …).
- **Environment scope** - which environments to include/exclude (common default: exclude `dev`, `staging`, `test`).
- **SLO-ish signals** - success-rate, latency, or throughput targets to highlight?
- **Priorities** - what to see first (drives row ordering and which section is `collapsed: true`).
Don't block on answers you can reasonably infer - state the inference and continue.
### Phase 3: Draft the internal plan
Write a markdown plan the user can approve before JSON generation:

```markdown
## Dashboard: <Service> - <Purpose>

### Section 1: <Overview> (collapsed: false)
- Row 1: [widget type] <title> - <what it shows> - source: metrics|logs|spans
- Row 2: ...

### Section 2: <Deep dive> (collapsed: false)
...

### Section N: <Logs & errors> (collapsed: true)
...

### Top-level filters
- <label> (<source>)

### Assumptions / gaps
- ...
```
Section design:

- First section: at-a-glance health (gauges + key rates), always expanded.
- Pair related time-series in the same row (rate + latency).
- Final section (raw logs, rare breakdowns): `collapsed: true`.
- Aim for 3–5 sections, 6–20 widgets total.
Widget-type selection:
| Signal | Widget type |
|---|---|
| Single headline number (count, % success, totals) | gauge (Coralogix calls this "stat") |
| Breakdown across ≤8 categories | pieChart |
| Change over time (rate, latency, count per bucket) | lineChart |
| Top-N tables, last errors, per-entity listings | dataTable |
Don't use other widget types unless the user asks.
Wait for the user to approve or adjust the plan before emitting JSON.
### Phase 4: Generate the Coralogix JSON
Produce a single JSON document following references/widget-templates.md. Key rules:

- **Top-level shape**:

  ```json
  {
    "id": "<21-char-nanoid>",
    "name": "<Dashboard Name>",
    "layout": { "sections": [ ... ] },
    "variables": [],
    "variablesV2": [],
    "filters": [ ... ],
    "relativeTimeFrame": "<seconds>s",
    "annotations": [],
    "off": {},
    "actions": []
  }
  ```

- **IDs** - fresh UUIDs for every `section`, `row`, `widget`, and query `id`.
- **Row height** - `"appearance": { "height": 19 }` unless there's a reason to change.
- **Section options** - include `options.custom.name`, `collapsed`, and `color.predefined: "SECTION_PREDEFINED_COLOR_UNSPECIFIED"`.
- **Filters** - one entry per slicing dimension from Phase 2. Default operator `equals` with empty `values` so users can fill in. Use `notEquals` for environment exclusions (see references/widget-templates.md).
- **relativeTimeFrame** - default `"172800s"` (48h) unless the user specified otherwise.
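The fresh-ID rule is easy to get wrong when hand-editing JSON. A minimal shell sketch for generating IDs programmatically (the helper names `new_nanoid` and `new_uuid` are illustrative, not part of the cx CLI):

```shell
# Illustrative helpers for the ID rules above: a 21-char URL-safe ID for the
# dashboard's top-level "id", and UUID-shaped strings for section / row /
# widget / query ids.
new_nanoid() {
  tr -dc 'A-Za-z0-9_-' < /dev/urandom | head -c 21
}

new_uuid() {
  # Read 16 random bytes, hex-encode them, then shape as 8-4-4-4-12.
  od -An -N16 -tx1 /dev/urandom | tr -d ' \n' |
    sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
}
```

Note these produce random hex in UUID shape rather than strict RFC 4122 v4 UUIDs; for dashboard JSON the requirement being illustrated is uniqueness within the document.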
For query syntax follow references/query-syntax.md; for the full query languages load references/dataprime-reference.md and references/promql-guidelines.md.
### Phase 5: Live-verify every query through the cx CLI

Every PromQL and DataPrime query in the draft must run successfully through cx before Phase 7. This catches invented metric names, typoed field paths, and malformed pipelines.
#### Frequent vs Archive (what / when / where in JSON)

What:

- **Frequent** (`TIER_FREQUENT_SEARCH`): hot tier for fast search on recent logs/spans.
- **Archive** (`TIER_ARCHIVE`): cold tier for older logs/spans (long-term).

When to choose:

- Choose Frequent for on-call and recent investigations (hours/days).
- Choose Archive for long lookbacks (weeks/months) or when the time range is beyond hot retention.
The two languages are verified against different windows:

- **PromQL**: map `relativeTimeFrame` to a `$RANGE` token (e.g. `48h` for `172800s`), substitute `[${__range}]` with `[$RANGE]` for the CLI call, then restore `${__range}` in the JSON before Phase 6. Range vectors are window-sensitive, so the check has to match what the dashboard will evaluate.
- **DataPrime**: verify against a fixed short window (`now-15m` → `now`, `--limit 1`). The goal is syntax / field / pipeline validation, not data-presence on the dashboard's window - a short window is faster and a cleaner fail signal.
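The PromQL substitution step can be sketched in shell (the helper names `range_token` and `prepare_promql` are illustrative; the authoritative procedure, including the full `$RANGE` mapping table, is in references/verification.md):

```shell
# Map a relativeTimeFrame like "172800s" to a PromQL duration token, then
# swap ${__range} for that token so the query can be run through the CLI.
range_token() {
  secs=${1%s}                        # strip the trailing "s"
  if [ $((secs % 3600)) -eq 0 ]; then
    echo "$((secs / 3600))h"         # whole hours, e.g. 172800s -> 48h
  else
    echo "$((secs / 60))m"           # otherwise fall back to minutes
  fi
}

prepare_promql() {
  # $1 = dashboard query containing [${__range}], $2 = relativeTimeFrame
  printf '%s\n' "$1" | sed "s/\${__range}/$(range_token "$2")/g"
}
```

For example, `prepare_promql 'rate(http_requests_total[${__range}])' 172800s` yields `rate(http_requests_total[48h])`; remember to restore `${__range}` in the JSON afterwards.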
Full procedure (CLI invocations, $RANGE mapping table, retry budget, failure modes): references/verification.md.
If a query can't be made to pass within the retry budget, surface it to the user with the CLI error verbatim - don't ship a broken widget.
### Phase 6: Self-verify structure
Run this checklist against the final JSON. Fix and re-check if any item fails before Phase 7.
#### Query syntax (dashboard-specific)

- Every PromQL range vector in a metrics widget uses `[${__range}]` - never `[$__range]`, never `[5m]` (unless the panel is intentionally a sliding window).
- `promqlQueryType` is `PROM_QL_QUERY_TYPE_INSTANT` for single-value widgets (gauge, pieChart, dataTable). Omitted for `lineChart`.
- DataPrime log queries use `$d.message` / `$l.applicationname` / unquoted severity enums (full rules: references/dataprime-reference.md).
- Every DataPrime widget query starts with `source logs` or `source spans` (dashboard widgets require the source prefix; Phase 5 verification strips it before handing the pipeline to `cx logs` / `cx spans`).
- Success-rate denominators wrapped in `clamp_min(..., 1)`.
- Histogram queries use the correct suffix (`_sum`, `_count`, `_bucket`).
- Widget queries are valid without the dashboard-level `filters` - Coralogix injects them at render time.
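To make the `clamp_min` rule concrete, a success-rate gauge query might look like this (the metric name `http_requests_total` and the `status` label are placeholders for illustration, not discovered names):

```promql
100 * sum(rate(http_requests_total{status!~"5.."}[${__range}]))
    / clamp_min(sum(rate(http_requests_total[${__range}])), 1)
```

Clamping the denominator at 1 makes an idle window render as 0 instead of NaN.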
#### Structure

- Each section has `id.value`, `rows`, and `options.custom`.
- Each row has `id.value`, `appearance.height`, and `widgets`.
- Each widget has a unique `id.value` and a `definition` with exactly one of `gauge` / `pieChart` / `lineChart` / `dataTable`.
- Success-rate gauges use `thresholdType: "THRESHOLD_TYPE_ABSOLUTE"` with green at high values; error/DLQ gauges use red at high values.
- "Total" / "stat" widgets are encoded as `gauge`, not as a stat type.
- Top-level `filters` includes each slicing dimension from Phase 2.
- All IDs are freshly generated UUIDs, unique within the document.
#### Content

- Dashboard name is descriptive (`"<Service> - <Purpose>"`).
- Widget titles are short, human-readable, and match what the query computes.
- The logs/errors section is `collapsed: true` unless the user said otherwise.
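Some of these checks are mechanical enough to script. A rough text-level sketch (grep heuristics over the JSON, not a full structural validator; the function name is illustrative):

```shell
# Heuristic pre-flight checks over the generated dashboard JSON. Prints a
# FAIL line per problem found; prints nothing when both checks pass.
check_dashboard_json() {
  f=$1
  # The [$__range] typo (missing braces) must never appear.
  if grep -q '\[\$__range\]' "$f"; then
    echo "FAIL: found [\$__range]; use [\${__range}]"
  fi
  # No id value may appear twice in the document.
  dupes=$(grep -o '"value": *"[^"]*"' "$f" | sort | uniq -d)
  if [ -n "$dupes" ]; then
    echo "FAIL: duplicate ids:"
    echo "$dupes"
  fi
}
```

Run it on the draft file and treat any output as a blocker before Phase 7.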
### Phase 7: Deploy via `cx dashboards create`
Don't tell the user to paste JSON into the Coralogix UI - deploy it directly.
- List folders: `cx dashboards folders list -o json`.
- Suggest the best folder match (team, product area, or a folder named after the service). Default to root (omit `--folder`) if nothing fits.
- Write the verified JSON to a temp file and run `cx dashboards create --from-file /tmp/cx-dashboard-<slug>.json --folder <id>`. The CLI generates the `requestId` envelope and prints the created dashboard ID.
Full procedure (folder-picking UX, command templates, idempotency note): references/deploy.md.
On failure: show the CLI error verbatim and return to Phase 5. The most common cause is a query that parses locally but is rejected by the live API.
## Output format for the user
```markdown
## Plan
<the approved Phase 3 plan>

## Verification
- PromQL queries verified: <N>/<N>
- DataPrime queries verified: <N>/<N>

## Deployed
- Dashboard: **<Name>**
- ID: `<id>`
- Folder: `<folder name or "root">`
- Profile: `<cx profile>`

The dashboard is live in Coralogix. Adjust filter values (e.g. `account_id`) after opening it.
```
## References
- Dashboard query gotchas & cross-references: references/query-syntax.md
- Widget JSON templates: references/widget-templates.md
- Live-verification procedure: references/verification.md
- Deploy procedure: references/deploy.md
- DataPrime language reference: references/dataprime-reference.md
- PromQL reference: references/promql-guidelines.md
- Log querying patterns: references/logs-querying.md
- Span querying patterns: references/spans-querying.md
- Inline DataPrime help: `cx dataprime list`, `cx dataprime show <command>`
- Coralogix Custom Dashboards docs: https://www.coralogix.com/docs/user-guides/custom-dashboards/introduction/
## Related Skills
- `cx-observability-setup` - full monitoring setup workflow (views, webhooks, notifications, integrations)
- `cx-incident-management` - SLO and alert-connected dashboards, incident triage
- `cx-telemetry-querying` - discover the right telemetry signal before building dashboards