add-provider
Add Provider
Orchestrates adding a new Grafana product provider — from API discovery through verified implementation. Four stages with human approval gates.
When to Use
- User wants to add CLI support for a Grafana Cloud product
- User says "add provider", "new provider", "integrate [product]"
- A bead task references provider implementation
When NOT to use: If the product exposes a K8s-compatible /apis endpoint,
it already works with gcx resources — no provider needed.
First: Check references/decision-tree.md to confirm a provider is the
right approach.
Workflow
Discover ──gate──> Design ──gate──> Implement ──gate──> Verify
│ │ │ │
v v v v
research report ADRs + spec code per stage smoke tests
| Stage | Deliverable | Gate |
|---|---|---|
| 1. Discover | docs/research/ report |
User approves findings |
| 2. Design | ADRs + spec + smoke test plan | User approves design |
| 3. Implement | Code (one stage at a time) | make all passes per stage |
| 4. Verify | Smoke tests + architecture doc updates | All checks green |
Prerequisites
Confirm with the user before starting:
- Product name — which Grafana product to integrate
- Access — do they have a running Grafana instance with the product enabled?
- Scope — full provider or single resource type first?
Stage 1: Discover
Guide:
docs/reference/provider-discovery-guide.mdSections 1.1–1.6
1a. Gather User Context
Before autonomous research, ask what the user already knows:
- Source code access — which repo?
- API documentation — OpenAPI specs, Grafana docs URLs?
- Terraform resources — does the Terraform provider support this product?
- Go SDK — existing Go client library?
- Known quirks — non-standard auth, async ops, unusual pagination?
Use answers to skip known areas and focus research on gaps.
1b. Research
Follow provider-discovery-guide.md Sections 1.1–1.6:
- Map API surface (base path, auth, endpoints, pagination)
- Check existing tooling (Terraform schemas, Go SDK)
- Inspect source code (undocumented endpoints, enum values)
- Identify auth model
- Map resource relationships
- Test API behavior with real calls
1c. Write Research Report
Write findings to docs/research/YYYY-MM-DD-{product}-provider.md using
the template at docs/_templates/research.md. Must include:
- API endpoints and response shapes discovered
- Auth model analysis
- Resource relationships
- At least one successful API call result
- Confidence assessment per finding
Gate: User Approves Research
Present the research report. Do not proceed to design until approved.
Stage 2: Design
Guide:
docs/reference/provider-discovery-guide.mdSection 2
2a. Design Decisions
Answer each decision from the guide, grounded in research findings:
- Auth strategy — reuse Grafana token or separate credentials?
- Client type — plugin API, K8s API, or external service?
- Envelope mapping — how do API objects map to K8s envelope?
- Command surface — CRUD + which beyond-CRUD commands?
- Package layout — flat or subpackaged?
- Staging — how to break into shippable stages?
For beyond-CRUD commands: brainstorm based on real APIs found in research (status, timeline, validation, etc.). Present options to user — include "CRUD only for now" as an option.
2b. Write ADRs
For each significant decision, write an ADR in
docs/adrs/{product}-provider/NNN-{decision}.md using the template at
docs/_templates/adr.md. At minimum, create ADRs for:
- Auth strategy choice
- Client type choice (plugin API vs K8s vs external)
Other decisions can be captured in the spec if they're straightforward.
2c. Write Spec
Write the implementation plan in docs/specs/{product}-provider/:
- Top-level plan with all stages, file tree, and decisions summary
- Per-stage docs with scope, files to create, and acceptance criteria
Reference implementations for plan structure:
- SLO:
docs/specs/slo-provider/2026-03-04-slo-provider-plan.md - Synth:
docs/specs/synth-provider/2026-03-06-synth-provider-plan.md
2d. Write Smoke Test Plan
Every stage doc MUST include a Verification section with concrete smoke test commands using real values (not placeholders). These are executed in Stage 4 after implementation.
Example pattern (replace with real product/resource names in actual spec):
# Provider appears in list
gcx providers | grep {name}
# Config secrets are redacted
gcx config view | grep {name}
# CRUD operations work
gcx {name} {resource} list
gcx {name} {resource} get <test-id>
gcx {name} {resource} push ./testdata/{resource}.yaml
gcx {name} {resource} pull -d ./tmp/
gcx {name} {resource} delete <test-id> --yes
# Unified resources path works
gcx resources get {alias}
Gate: User Approves Design
Present ADRs and spec. Do not proceed to implementation until approved.
Stage 3: Implement
Guide:
docs/reference/provider-guide.md(Steps 1–7) UX Guide:docs/reference/design-guide.md(all[CURRENT]and[ADOPT]items)
Implement one stage at a time per the approved spec. Each stage's doc is self-contained enough to resume in a fresh session.
If /build-spec or /build-task skills are available, use them to drive
implementation. Otherwise, follow provider-guide.md Steps 1–7 directly.
Summary of the key steps:
- Provider interface +
init()+configLoader(copy from SLO reference) - Config keys + validation
- Commands with UX compliance
- Types + client + adapter per resource type
- Register (blank import + ResourceAdapter)
- Tests (interface compliance, adapter round-trip, client httptest)
Key patterns (see provider-guide.md for details):
- Hand-roll HTTP client (~200 LOC) — don't use generated OpenAPI clients
- Copy full
configLoaderfrominternal/providers/slo/provider.go - Config key names use hyphen-case
- Adapter must strip server-generated fields on Create/Update
Gate: Stage Complete
Per stage: make all passes, no regressions.
Stage 4: Verify
4a. Run Smoke Tests
Execute every smoke test command from the Stage 2d verification plan against a real Grafana instance. Record results (pass/fail + output).
4b. Run Checklists
From docs/reference/design-guide.md Section 7 and docs/reference/provider-guide.md:
Interface: All 5 Provider methods, Name() lowercase/unique, ConfigKeys
complete, secrets marked, Validate returns actionable errors, blank import added.
UX: -o json/yaml support, text table default, actionable error suggestions,
no os.Exit(), cmdio status messages, help text standards, push idempotent,
format-agnostic data fetching, promql-builder for PromQL.
Build: make all, gcx providers lists it, config view redacts.
4c. Update Architecture Docs
Follow docs/reference/doc-maintenance.md structural checks — a new provider
adds packages to internal/ and commands to cmd/, so architecture docs
need updating.
Gate: All Green
All smoke tests pass, all checklists green, docs updated.
Reference Implementations
| Provider | Auth Model | API Type | Key Entry Point |
|---|---|---|---|
| SLO | Same Grafana token | Plugin API | internal/providers/slo/provider.go |
| Synth | Separate URL + token | External service | internal/providers/synth/provider.go |
Spec plans: docs/specs/slo-provider/, docs/specs/synth-provider/
Common Pitfalls
| Pitfall | Mitigation |
|---|---|
| K8s CRDs not externally accessible | Verify with real API call before choosing K8s client |
| Incomplete OpenAPI specs | Cross-reference with source code route handlers |
| configLoader is non-trivial | Copy full impl from SLO, don't simplify |
| Missing blank import | Add _ ".../{name}" in cmd/gcx/root/command.go |
| readOnly fields in POST/PUT | Adapter must strip server-generated fields |
More from grafana/gcx
gcx
>
5explore-datasources
Discover what datasources, metrics, labels, and log streams are available in a Grafana instance. Use when the user asks what data exists, what metrics are available, what services are being monitored, or needs to find a datasource UID.
4setup-gcx
>
3gcx-observability
>
3slo-check-status
Use when the user asks about SLO health, wants an overview of all SLOs, or needs status of a specific SLO. Trigger on phrases like "how are my SLOs doing", "SLO status", "check my SLOs", "is my SLO healthy", "SLO budget", "SLO burn rate". For investigating breaching SLOs use slo-investigate. For optimization suggestions use slo-optimize. For creating or modifying SLO definitions use slo-manage.
2slo-investigate
Use when a specific SLO is breaching or alerting and the user needs to understand why — root cause analysis, dimensional breakdown, alert rule correlation, runbook access. Trigger on phrases like "investigate SLO", "why is my SLO breaching", "SLO error budget burning", "SLO alerting". For SLO status overview use slo-check-status. For creating or modifying SLOs use slo-manage. For optimization suggestions use slo-optimize.
2