apm-service-dependencies
Installation
SKILL.md
APM Service Dependencies
This tool answers "how is my application wired together?" It returns the APM dependency graph — a set of directed edges from caller services to callee services, with protocol (http, grpc, dns, etc.), call volume, and per-service health (span count, latency, errors) when requested.
Prerequisites
- Elastic APM with OTel-instrumented services producing
span.destination.service.resourcevalues. - Optional: Kubernetes metadata on the spans (for namespace filtering).
If the user is log-only or metrics-only (no APM), this tool won't work. Do not reach for it.
Tools
| Tool | Purpose |
|---|---|
apm-service-dependencies |
Fetch the dependency graph (full or focal). |
apm-health-summary |
Prerequisite view: which services are degraded? Then map their neighborhood with this skill. |
ml-anomalies |
Drill into anomalies affecting a service discovered in the graph. |
k8s-blast-radius |
If a service is K8s-deployed and a node is implicated. |
How to call apm-service-dependencies
Focal mode (most common)
Use when you know which service is the investigation target. Returns only that service's direct upstream and downstream neighbors — much easier to reason about than the full graph.
{
"service": "checkout",
"lookback": "1h",
"include_health": true
}
Full-graph mode
Use sparingly — only when the user explicitly wants the whole topology, or during initial environment discovery.
{
"lookback": "1h",
"include_health": false
}
Parameter-filling guidance:
service: the exact OTelservice.nameas deployed — typically lowercase and hyphenated for multi-word services (frontend,checkout,product-catalog). Do not concatenate spaces — if the user says "checkout service" passcheckout, notcheckoutservice. If the name is ambiguous, ask the user to confirm before calling. If the tool returns no edges for the named focal service, confirm the name with the user before fuzzy-matching.namespace: only if the user scopes to a K8s namespace AND services are K8s-deployed.lookback: default1h. Use15mfor "right now,"24hto smooth transient topology changes.include_health: default true. Set false for a topology-only response when you don't need latency/error data.
After the tool returns
Response shape:
services: list of nodes with optional language/deployment/namespace metadata and health stats.edges: directed edges withsource,target,protocol,port,call_count,avg_latency_us.focal_service+upstream+downstream(focal mode only).service_count/edge_count.data_coverage_note(only on focal mode when the focal service has inbound but zero outbound edges): flags a likely instrumentation gap — don't claim the service is aleaf; relay the note to the user as an advisory.
Lead your narrative with:
- Focal service: "checkoutservice has 3 upstream callers and 5 downstream dependencies."
- Upstream callers (who depends on this service): name them, note call volumes. Outage here cascades up.
- Downstream dependencies (what this service relies on): name them. Problems here cascade in.
- Hot edges: highest call volume or latency — likely the load-bearing paths.
- Follow-ups: suggest
ml-anomalieson a specific neighbor if its health shows errors or elevated latency.
Key principles
- Prefer focal mode. The full graph is hard to narrate; a focal subgraph is crisp.
- Direction matters. Upstream = who calls me (blast radius goes up). Downstream = what I call (problems cascade in). Don't mix them up in explanations.
- Protocols and ports are clues. A DNS edge tells you the callee is resolved by name (k8s service?). A high-port HTTP call to a specific target hints at a sidecar or proxy.
- Empty or tiny graphs are a signal. If the focal service has zero edges, either the lookback is too narrow, the service isn't instrumented, or the name is wrong. Do not silently report "no dependencies."