Query Tracing Skill

Use distributed tracing (OpenTelemetry / Tempo) to understand where time is spent during SQL statement execution in a local Materialize instance.

For background on Materialize's tracing infrastructure (span levels, filtering, #[tracing::instrument], distributed context propagation), see doc/developer/tracing.md.

Important: Use Optimized Builds

Always trace --optimized builds. Debug builds introduce significant artifacts — unoptimized code, extra debug checks, and inflated self-times that don't reflect production behavior. Tracing a debug build can be misleading because time may appear in places that are negligible in release builds.

bin/environmentd --optimized --monitoring

The --optimized build takes longer to compile but produces traces that accurately represent where time is actually spent.

Prerequisites

The local monitoring stack (Tempo + Grafana + Prometheus) and environmentd with tracing enabled must be running. All network calls require dangerouslyDisableSandbox.

Important: Start the monitoring stack before environmentd. If Tempo is not running when environmentd starts sending spans, those spans are lost — they will not be retroactively captured.

1. Start the monitoring stack

bin/mzcompose --find monitoring run default

Wait for Tempo to be ready:

# Poll until Tempo is accepting requests
for i in $(seq 1 30); do
    curl -s -o /dev/null -w "%{http_code}" http://localhost:3200/ready | grep -q 200 && break
    sleep 5
done

2. Start environmentd with tracing

bin/environmentd --optimized --monitoring

Or with the trace filter pre-set:

bin/environmentd --optimized --reset --monitoring -- --system-parameter-default='opentelemetry_filter=debug'

3. Verify both services are running

Before tracing, confirm the stack is healthy:

# Tempo accepting traces?
curl -s -o /dev/null -w "%{http_code}" http://localhost:3200/ready
# environmentd listening?
lsof -tiTCP:6875 -sTCP:LISTEN

4. Set the trace filter at runtime (if not set at startup)

Connect as mz_system and set the OpenTelemetry filter:

psql -U mz_system -h localhost -p 6877 materialize -c "ALTER SYSTEM SET opentelemetry_filter = 'debug';"

debug is a good starting filter. For even more detail, use trace (but expect much more data). You can also target specific modules:

ALTER SYSTEM SET opentelemetry_filter = 'mz_adapter::coord=trace,debug';
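
To confirm the new filter took effect, and to restore the default once you are done tracing (a noisy filter adds overhead), something like the following should work over the same mz_system connection; treat it as a sketch rather than part of the documented workflow:

psql -U mz_system -h localhost -p 6877 materialize -c "SHOW opentelemetry_filter;"
# Restore the default filter when finished tracing
psql -U mz_system -h localhost -p 6877 materialize -c "ALTER SYSTEM RESET opentelemetry_filter;"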

Workflow: Trace a Query

Step 1: Run the query and capture the trace ID

Use psql with emit_trace_id_notice enabled. Important: Use separate -c flags for SET and the query, since DDL statements (CREATE, DROP, ALTER) cannot run inside an implicit transaction block:

psql -U materialize -h localhost -p 6875 materialize \
  -c "SET emit_trace_id_notice = true;" \
  -c "<YOUR SQL STATEMENT HERE>;"

The output will contain NOTICE lines with trace IDs — one per statement. The SET statement gets its own trace ID; the second trace ID is the one you want:

SET
NOTICE:  trace id: fd2f69eb059e4823d8d48a87ccab8f6f
NOTICE:  trace id: 984bd8c69f99703243d350e14b02caea
CREATE TABLE

The trace ID for the query is the last one before the statement result.

To extract the trace ID programmatically:

OUTPUT=$(psql -U materialize -h localhost -p 6875 materialize \
  -c "SET emit_trace_id_notice = true;" \
  -c "CREATE TABLE foo (id INT);" 2>&1)
TRACE_ID=$(echo "$OUTPUT" | grep "trace id:" | tail -1 | sed 's/.*trace id: //')
echo "Trace ID: $TRACE_ID"

Step 2: Wait for spans to be flushed

Spans are batched before export (5s delay by default, set by --opentelemetry-sched-delay). Wait ~10 seconds after the query completes before fetching the trace.

Step 3: Fetch the trace from Tempo

Query the Tempo HTTP API on port 3200. Retry if needed — spans may still be in flight:

sleep 10
# Retry loop — first attempt often gets "trace not found"
for i in $(seq 1 5); do
    RESP=$(curl -s http://localhost:3200/api/traces/$TRACE_ID)
    if echo "$RESP" | grep -q "batches"; then
        echo "$RESP" > /tmp/claude-1000/trace.json
        echo "Trace fetched ($(echo "$RESP" | wc -c) bytes)"
        break
    fi
    echo "Trace not ready, retrying... ($i/5)"
    sleep 5
done

Step 4: Analyze the trace

The Tempo API returns traces in OTLP JSON format. Important: Span IDs and parent span IDs are base64-encoded, not hex. The analysis script handles this.
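
If you need to compare a span ID from the JSON against the hex form shown elsewhere (for example in Grafana), a one-liner along these lines converts base64 to hex; the span ID below is made up purely for illustration:

# Decode a base64 span ID from the OTLP JSON into the usual 16-character hex form
echo -n 'lbvvcqxkTXk=' | base64 -d | xxd -p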

Use the trace_tree.py script in this skill directory to analyze traces:

python3 .claude/skills/mz-query-tracing/trace_tree.py /tmp/claude-1000/trace.json "My Query"
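
For a quick look at the largest spans without the script, a jq one-liner along these lines can work; note that the exact field names (e.g. scopeSpans vs. instrumentationLibrarySpans) vary across Tempo versions, so adjust to whatever your trace.json actually contains:

# Top 10 spans by wall-clock duration in milliseconds (rough sketch only)
jq '[.batches[].scopeSpans[]?.spans[]?
     | {name, ms: (((.endTimeUnixNano | tonumber) - (.startTimeUnixNano | tonumber)) / 1e6)}]
     | sort_by(-.ms) | .[:10]' /tmp/claude-1000/trace.json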

Interpreting Results

  • Self-time is where actual work happens. A span with 100ms duration but 0ms self-time is just a wrapper — its children do all the work.
  • group_commit_apply::append_fut — Persist append operations, often dominant for DDL. Time here is spent writing to the durable log.
  • catalog::transact_inner — Catalog transaction processing. High self-time means catalog state manipulation.
  • oracle::write_ts / oracle::apply_write — Timestamp oracle calls to CockroachDB. Each is a round-trip.
  • consensus::compare_and_set / consensus::scan — Persist consensus operations (also CockroachDB round-trips).
  • coord::check_consistency — Post-DDL consistency check. Self-time here is pure computation.
  • Sequential vs. concurrent children — children with overlapping time ranges are concurrent; non-overlapping are sequential.
  • Code locations in [src/...] brackets let you jump directly to the source code.
  • For spans not listed above, read the source code at the location shown in the [src/...] bracket to understand what the span does and why it's taking time.

Tips

  • Use --optimized builds. Debug builds inflate self-times with unoptimized code paths and extra debug checks (e.g., check_consistency may appear much larger than in production). Always trace optimized builds for meaningful results.
  • Start monitoring before environmentd. Spans emitted before Tempo is ready are silently dropped. If you restart environmentd, make sure Tempo is still running.
  • If traces are empty or incomplete, the opentelemetry_filter may be too restrictive. Try debug or even trace.
  • The first fetch after a query often returns trace not found. The default batch delay is 5s, but ingestion adds latency. Always use a retry loop.
  • Tempo retains traces for only 15 minutes by default (configured in misc/monitoring/tempo.yml). Fetch traces promptly.
  • To search for recent traces: curl -s "http://localhost:3200/api/search?limit=10"
  • You can also view traces in Grafana at http://localhost:3000 by searching for the trace ID in the Tempo datasource explore view.
  • To compare before and after a code change, trace the same query on both builds and compare the self-time rankings side by side; a sketch of one way to do this follows.
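
One option, assuming you save each run's trace and tree output under different names (the file names here are arbitrary), is to diff the two reports:

# Before the change
python3 .claude/skills/mz-query-tracing/trace_tree.py /tmp/claude-1000/trace_before.json "Before" > /tmp/claude-1000/tree_before.txt
# After the change (same query, rebuilt environmentd)
python3 .claude/skills/mz-query-tracing/trace_tree.py /tmp/claude-1000/trace_after.json "After" > /tmp/claude-1000/tree_after.txt
diff -u /tmp/claude-1000/tree_before.txt /tmp/claude-1000/tree_after.txt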