LangChain Observability

Overview
Prerequisites
Instructions
Output
Error Handling
Examples
Resources

Overview

Set up comprehensive observability for LangChain applications with LangSmith, OpenTelemetry, Prometheus, and structured logging.

Prerequisites

LangChain application in staging/production
LangSmith account (optional but recommended)
Prometheus/Grafana infrastructure
OpenTelemetry collector (optional)

Instructions

Step 1: Enable LangSmith Tracing

Set LANGCHAIN_TRACING_V2=true and configure the LANGCHAIN_API_KEY and LANGCHAIN_PROJECT environment variables. Once these are set, chain and LLM runs are traced to LangSmith automatically, with no code changes.
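
As a minimal sketch, a startup check like the one below can catch a missing or mistyped tracing variable before the application serves traffic. The helper name missing_tracing_vars is hypothetical; only the three variable names come from this step.

```python
import os
from typing import Mapping

# The three variables named in this step.
REQUIRED_VARS = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT")


def missing_tracing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return a list of problems with the tracing configuration (empty if fine)."""
    # Any variable that is unset or empty counts as missing.
    problems = [name for name in REQUIRED_VARS if not env.get(name)]
    # Tracing is only switched on when the flag is literally "true".
    flag = env.get("LANGCHAIN_TRACING_V2")
    if flag and flag.lower() != "true":
        problems.append("LANGCHAIN_TRACING_V2 (must be 'true')")
    return problems


if __name__ == "__main__":
    issues = missing_tracing_vars()
    if issues:
        raise SystemExit(f"LangSmith tracing is not configured: {issues}")
```

Running the check at process start keeps a misconfigured deployment from silently producing no traces.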

Step 2: Add Prometheus Metrics

Create a PrometheusCallback handler that tracks three metrics: the counters langchain_llm_requests_total and langchain_llm_tokens_total, and the histogram langchain_llm_latency_seconds.

Step 3: Integrate OpenTelemetry

Use OTLPSpanExporter with a custom OpenTelemetryCallback to add spans for chain and LLM operations with parent-child relationships.

Step 4: Configure Structured Logging

Use structlog with a StructuredLoggingCallback to emit JSON logs for all LLM start/end/error events.

Step 5: Set Up Grafana Dashboard

Create panels for request rate, P95 latency, token usage, and error rate using Prometheus queries.
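
The four panels can be sketched as dashboard-as-code, generating a minimal Grafana dashboard JSON from PromQL expressions. The queries assume a status label on the request counter, a type label on the token counter, and a 5-minute rate window; none of these are specified in this guide, so adjust them to match your metrics. The panels rely on Grafana's default data source.

```python
import json

# Assumptions: label names (status, type) and the 5m window are illustrative.
PANEL_QUERIES = {
    "Request rate (req/s)": "sum(rate(langchain_llm_requests_total[5m]))",
    "P95 latency (s)": (
        "histogram_quantile(0.95, "
        "sum by (le) (rate(langchain_llm_latency_seconds_bucket[5m])))"
    ),
    "Token usage (tokens/s)": "sum by (type) (rate(langchain_llm_tokens_total[5m]))",
    "Error rate": (
        'sum(rate(langchain_llm_requests_total{status="error"}[5m])) '
        "/ sum(rate(langchain_llm_requests_total[5m]))"
    ),
}


def build_dashboard(title: str = "LangChain LLM Observability") -> dict:
    """Lay the panels out two per row on Grafana's 24-column grid."""
    panels = []
    for index, (panel_title, expr) in enumerate(PANEL_QUERIES.items()):
        panels.append(
            {
                "id": index + 1,
                "title": panel_title,
                "type": "timeseries",
                "gridPos": {"h": 8, "w": 12, "x": 12 * (index % 2), "y": 8 * (index // 2)},
                "targets": [{"expr": expr, "refId": "A"}],
            }
        )
    return {"title": title, "panels": panels}


if __name__ == "__main__":
    # Print JSON that can be pasted into Grafana's dashboard import dialog.
    print(json.dumps(build_dashboard(), indent=2))
```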

Step 6: Configure Alerting Rules

Define Prometheus alerts for high error rate (>5%), high latency (P95 >5s), and token budget exceeded.
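
As a hedged illustration, the three alerts can be rendered into a Prometheus rule file from a small Python helper so that the thresholds stay in one place. The 5% and 5s thresholds come from this step; the daily token budget, the `for` durations, the severity labels, the alert names, and the status label are placeholders to replace with your own values.

```python
from textwrap import dedent


def render_alert_rules(
    max_error_ratio: float = 0.05,  # error rate > 5%, from this step
    max_p95_seconds: float = 5.0,  # P95 latency > 5s, from this step
    daily_token_budget: int = 1_000_000,  # hypothetical budget; set your own
) -> str:
    """Return the text of a Prometheus rule file (YAML) with three alerts."""
    return dedent(
        f"""\
        groups:
          - name: langchain-llm
            rules:
              - alert: LangChainHighErrorRate
                expr: >-
                  sum(rate(langchain_llm_requests_total{{status="error"}}[5m]))
                  / sum(rate(langchain_llm_requests_total[5m])) > {max_error_ratio}
                for: 5m
                labels:
                  severity: critical
              - alert: LangChainHighLatency
                expr: >-
                  histogram_quantile(0.95,
                  sum by (le) (rate(langchain_llm_latency_seconds_bucket[5m])))
                  > {max_p95_seconds}
                for: 10m
                labels:
                  severity: warning
              - alert: LangChainTokenBudgetExceeded
                expr: sum(increase(langchain_llm_tokens_total[1d])) > {daily_token_budget}
                labels:
                  severity: warning
        """
    )


if __name__ == "__main__":
    # Write the output to a file referenced from rule_files in prometheus.yml.
    print(render_alert_rules())
```

The `for` clause makes an alert fire only after the condition has held for that long, which is the main lever for reducing noise from short spikes.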

See detailed implementation for complete callback code, dashboard JSON, and alert rules.

Output

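LangSmith tracing enabled for all chains
Prometheus metrics exported (request count, latency, token usage)
OpenTelemetry spans for chain and LLM operations
Structured JSON logs for LLM start/end/error events
Grafana dashboard and alerting rules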

Error Handling

Issue	Cause	Solution
Missing metrics	Callback not attached	Pass callback to LLM constructor
Trace gaps	Missing context propagation	Check parent span handling
Alert storms	Thresholds too sensitive	Tune `for` duration and thresholds

Examples

Basic usage: Enable LangSmith tracing (Step 1) and attach the Prometheus callback (Step 2) in a standard project, leaving all other settings at their defaults.

Advanced scenario: For production, add OpenTelemetry spans, structured logging, Grafana dashboards, and alerting (Steps 3 to 6), tuning thresholds and labels to your team's requirements.

Resources

Next Steps

Use langchain-incident-runbook for incident response procedures.
