Set up tracing, logging, and monitoring for deployed ADK agents across Cloud Trace, BigQuery, and third-party platforms.

Four observability tiers: Cloud Trace (always enabled, distributed tracing), Prompt-Response Logging (GenAI interactions to GCS/BigQuery), BigQuery Agent Analytics (structured agent events), and third-party integrations (AgentOps, Phoenix, MLflow, Weave, Freeplay, and others)
For Agent Runtime deployments, run agents-cli infra single-project before first deploy to provision Terraform-managed infrastructure (service account, GCS bucket, BigQuery dataset); post-deployment setup requires manual IAM and env var configuration
Cloud Trace works out of the box with OpenTelemetry spans tracking invocation flow, LLM calls, and tool execution; accessible via Cloud Console Trace explorer
Prompt-response logging is privacy-preserving by default (metadata only via OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=NO_CONTENT); disabled locally unless LOGS_BUCKET_NAME is set
Includes troubleshooting guide covering missing traces, privacy misconfiguration, BigQuery setup, and cost optimization strategies

ADK Observability Guide

Cloud Trace works out of the box, with no infrastructure needed. Prompt-response logging and BigQuery Agent Analytics require Terraform-provisioned infrastructure (service account, GCS bucket, BigQuery dataset). Run `agents-cli infra single-project --project PROJECT_ID` to provision these resources. See `references/cloud-trace-and-logging.md` for details, env vars, and verification commands. If your project isn't scaffolded yet, see `/google-agents-cli-scaffold` first.
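As a rough local sanity check for the prompt-response logging settings mentioned in the summary, the sketch below classifies an environment from two variables. The helper name `describe_prompt_logging` and its three labels are hypothetical and not part of ADK or agents-cli. It encodes only what this guide states (logging is off locally unless `LOGS_BUCKET_NAME` is set, and `NO_CONTENT` means metadata only) and reports every other combination as unverified rather than guessing.

```python
import os
from typing import Mapping

# Variable names taken from this guide.
CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"
BUCKET_VAR = "LOGS_BUCKET_NAME"


def describe_prompt_logging(env: Mapping[str, str]) -> str:
    """Hypothetical helper: classify prompt-response logging for a local run."""
    # The guide says logging is disabled locally unless the bucket is set.
    if not env.get(BUCKET_VAR):
        return "disabled"
    # NO_CONTENT is the privacy-preserving, metadata-only setting.
    if env.get(CAPTURE_VAR) == "NO_CONTENT":
        return "metadata-only"
    # Any other value is outside what this guide documents.
    return "unverified"


if __name__ == "__main__":
    print(describe_prompt_logging(os.environ))
```

If the result is `unverified`, the authoritative list of variables and values is in `references/cloud-trace-and-logging.md`.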

Order of operations for `agent_runtime` deployments

For `deployment_target = agent_runtime`, run `agents-cli infra single-project` before the first `agents-cli deploy`. The Terraform module owns the entire Reasoning Engine resource (display_name, service account, deployment spec, env vars), so applying it after an SDK-based deploy creates a state mismatch: Terraform has no record of the SDK-deployed instance and cannot layer env vars onto it without taking ownership of the whole resource.

If you have already run `agents-cli deploy`, you have two options:

Switch to Terraform-managed. Delete the SDK-deployed Reasoning Engine, then run `agents-cli infra single-project` followed by `agents-cli deploy`. Sessions and any in-flight state on the previous instance are lost.
Keep the SDK-deployed instance. Skip `infra single-project` and set the observability env vars on the running instance directly via the vertexai client update API. You will also need to grant the instance's service account the IAM permissions required to emit telemetry: write access to the logs GCS bucket, access to the BigQuery dataset, log-writer permissions, and so on. See `deployment/terraform/single-project/iam.tf` and `telemetry.tf` in your scaffolded project for the full set of bindings the Terraform module would otherwise provision. Terraform-managed env vars are not available in this mode.
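The ordering rules above can be summarized as a small decision function. This is only an illustrative sketch: `plan_agent_runtime_steps` and its parameters are hypothetical names, and it returns human-readable steps taken from this section rather than executing anything.

```python
from typing import List


def plan_agent_runtime_steps(already_deployed: bool, keep_existing: bool = False) -> List[str]:
    """Hypothetical helper: list the steps this guide prescribes for agent_runtime."""
    if not already_deployed:
        # Fresh project: Terraform must own the resource before the first deploy.
        return ["agents-cli infra single-project", "agents-cli deploy"]
    if keep_existing:
        # Keep the SDK-deployed instance and configure it by hand.
        return [
            "skip agents-cli infra single-project",
            "set observability env vars via the vertexai client update API",
            "grant telemetry IAM bindings to the instance's service account",
        ]
    # Switch to Terraform-managed: sessions and in-flight state are lost.
    return [
        "delete the SDK-deployed Reasoning Engine",
        "agents-cli infra single-project",
        "agents-cli deploy",
    ]


if __name__ == "__main__":
    for step in plan_agent_runtime_steps(already_deployed=True):
        print(step)
```

Returning plain strings keeps the sketch free of assumptions about the vertexai client or IAM tooling, whose exact calls are not covered in this section.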

Reference Files

| File | Contents |
| --- | --- |
| `references/cloud-trace-and-logging.md` | Scaffolded project details: Terraform-provisioned resources, environment variables, verification commands, enabling/disabling locally |
| `references/bigquery-agent-analytics.md` | BQ Agent Analytics plugin: enabling, key features, GCS offloading, tool provenance |

