langchain-observability

LangChain Observability

Overview

Set up comprehensive observability for LangChain applications with LangSmith, OpenTelemetry, Prometheus, and structured logging.

Prerequisites

  • LangChain application in staging/production
  • LangSmith account (optional but recommended)
  • Prometheus/Grafana infrastructure
  • OpenTelemetry collector (optional)

Instructions

Step 1: Enable LangSmith Tracing

Set LANGCHAIN_TRACING_V2=true and configure LANGCHAIN_API_KEY and LANGCHAIN_PROJECT environment variables. All chains are automatically traced.
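A minimal sketch of the environment setup, done in Python before any LangChain imports so the tracer picks the values up (the project name here is a placeholder):

```python
import os

# LangSmith tracing is controlled entirely by environment variables;
# set them before importing any LangChain module.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-api-key>"  # from LangSmith settings
os.environ["LANGCHAIN_PROJECT"] = "my-app-staging"            # traces group under this project

# From here on, chain and LLM invocations are traced automatically;
# no callback wiring is needed for LangSmith.
```

The same variables can equally be exported in the shell or set in a deployment manifest; the only requirement is that they exist before the process imports LangChain.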

Step 2: Add Prometheus Metrics

Create a PrometheusCallback handler that tracks the langchain_llm_requests_total and langchain_llm_tokens_total counters and the langchain_llm_latency_seconds histogram.
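One possible shape for such a handler, sketched with prometheus_client. In a real application the class would subclass `BaseCallbackHandler` from `langchain_core.callbacks` and be passed via `callbacks=[...]`; the hook names and the `model`/`status`/`type` labels below are illustrative assumptions:

```python
from time import monotonic

from prometheus_client import REGISTRY, Counter, Histogram

class PrometheusCallback:
    """Sketch of a LangChain callback exporting Prometheus metrics.

    Real code would subclass langchain_core.callbacks.BaseCallbackHandler;
    the hook signatures below mirror its on_llm_start/end/error methods.
    """

    def __init__(self, registry=REGISTRY):
        self.requests = Counter(
            "langchain_llm_requests_total", "Total LLM requests",
            ["model", "status"], registry=registry)
        self.latency = Histogram(
            "langchain_llm_latency_seconds", "LLM request latency in seconds",
            ["model"], registry=registry)
        self.tokens = Counter(
            "langchain_llm_tokens_total", "Total tokens consumed",
            ["model", "type"], registry=registry)
        self._starts = {}  # run_id -> (model name, start time)

    def on_llm_start(self, serialized, prompts, *, run_id, **kwargs):
        model = (serialized or {}).get("name", "unknown")
        self._starts[run_id] = (model, monotonic())

    def on_llm_end(self, response, *, run_id, **kwargs):
        model, t0 = self._starts.pop(run_id, ("unknown", monotonic()))
        self.requests.labels(model=model, status="success").inc()
        self.latency.labels(model=model).observe(monotonic() - t0)
        # Token usage location varies by provider; this mirrors the common
        # response.llm_output["token_usage"] layout.
        usage = (getattr(response, "llm_output", None) or {}).get("token_usage", {})
        for kind in ("prompt_tokens", "completion_tokens"):
            if kind in usage:
                self.tokens.labels(model=model, type=kind).inc(usage[kind])

    def on_llm_error(self, error, *, run_id, **kwargs):
        model, _ = self._starts.pop(run_id, ("unknown", 0.0))
        self.requests.labels(model=model, status="error").inc()
```

Keeping per-run state keyed by `run_id` lets one handler instance serve concurrent requests without mixing up latencies.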

Step 3: Integrate OpenTelemetry

Use OTLPSpanExporter with a custom OpenTelemetryCallback to add spans for chain and LLM operations with parent-child relationships.
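The core of the parent-child wiring is mapping LangChain's `run_id`/`parent_run_id` pairs onto span links. The sketch below uses a stand-in `Span` dataclass to keep it dependency-free; a real callback would call `tracer.start_span(...)` from opentelemetry-sdk and export through `OTLPSpanExporter` behind a `BatchSpanProcessor`:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Span:
    """Stand-in for an OpenTelemetry span (real code: opentelemetry-sdk)."""
    name: str
    parent: Optional["Span"] = None
    attributes: dict = field(default_factory=dict)
    ended: bool = False

    def end(self):
        self.ended = True

class OpenTelemetryCallback:
    """Maps run_id/parent_run_id from LangChain onto span parent-child links."""

    def __init__(self):
        self.open_spans = {}  # run_id -> Span still in flight
        self.finished = []    # ended spans, in completion order

    def _start(self, name, run_id, parent_run_id):
        parent = self.open_spans.get(parent_run_id)  # None => root span
        self.open_spans[run_id] = Span(name, parent=parent)

    def _end(self, run_id):
        span = self.open_spans.pop(run_id, None)
        if span is not None:
            span.end()
            self.finished.append(span)

    def on_chain_start(self, serialized, inputs, *, run_id, parent_run_id=None, **kw):
        self._start(f"chain:{(serialized or {}).get('name', 'unknown')}", run_id, parent_run_id)

    def on_chain_end(self, outputs, *, run_id, **kw):
        self._end(run_id)

    def on_llm_start(self, serialized, prompts, *, run_id, parent_run_id=None, **kw):
        self._start(f"llm:{(serialized or {}).get('name', 'unknown')}", run_id, parent_run_id)

    def on_llm_end(self, response, *, run_id, **kw):
        self._end(run_id)
```

Because every LLM run carries the `run_id` of its enclosing chain as `parent_run_id`, looking the parent up in the open-span table is all that is needed to nest LLM spans under chain spans.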

Step 4: Configure Structured Logging

Use structlog with a StructuredLoggingCallback to emit JSON logs for all LLM start/end/error events.
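A stdlib-only sketch of the event shape such a callback emits; a real implementation would bind the same fields with structlog rather than hand-rolling `json.dumps`, and the field names here are illustrative:

```python
import json
import logging

logger = logging.getLogger("langchain.events")
logger.setLevel(logging.INFO)

class StructuredLoggingCallback:
    """Emits one JSON log line per LLM start/end/error event."""

    @staticmethod
    def _emit(event, **fields):
        # structlog would do logger.info(event, **fields); here we
        # serialize by hand to keep the sketch dependency-free.
        logger.info(json.dumps({"event": event, **fields}))

    def on_llm_start(self, serialized, prompts, *, run_id, **kw):
        self._emit("llm_start", run_id=str(run_id),
                   model=(serialized or {}).get("name", "unknown"),
                   prompt_count=len(prompts))

    def on_llm_end(self, response, *, run_id, **kw):
        self._emit("llm_end", run_id=str(run_id))

    def on_llm_error(self, error, *, run_id, **kw):
        self._emit("llm_error", run_id=str(run_id), error=repr(error))
```

Keeping every event as a single JSON object makes the logs directly queryable in Loki, CloudWatch Logs Insights, or similar backends.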

Step 5: Set Up Grafana Dashboard

Create panels for request rate, P95 latency, token usage, and error rate using Prometheus queries.
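Assuming the metric names and `model`/`status`/`type` labels from Step 2, the four panels could be backed by PromQL along these lines:

```promql
# Request rate per model (req/s, 5m window)
sum by (model) (rate(langchain_llm_requests_total[5m]))

# P95 latency per model
histogram_quantile(0.95,
  sum by (model, le) (rate(langchain_llm_latency_seconds_bucket[5m])))

# Token throughput, split by prompt vs completion
sum by (type) (rate(langchain_llm_tokens_total[5m]))

# Error rate as a fraction of all requests
sum(rate(langchain_llm_requests_total{status="error"}[5m]))
  / sum(rate(langchain_llm_requests_total[5m]))
```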

Step 6: Configure Alerting Rules

Define Prometheus alerts for high error rate (>5%), high latency (P95 >5s), and token budget exceeded.
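A sketch of the rule file under those thresholds; the group name, severities, and daily token budget are placeholders to adjust per deployment:

```yaml
groups:
  - name: langchain
    rules:
      - alert: LangChainHighErrorRate
        expr: |
          sum(rate(langchain_llm_requests_total{status="error"}[5m]))
            / sum(rate(langchain_llm_requests_total[5m])) > 0.05
        for: 10m  # require a sustained breach to avoid alert storms
        labels: {severity: page}
        annotations:
          summary: "LLM error rate above 5% for 10 minutes"
      - alert: LangChainHighLatency
        expr: |
          histogram_quantile(0.95,
            sum by (le) (rate(langchain_llm_latency_seconds_bucket[5m]))) > 5
        for: 10m
        labels: {severity: warn}
        annotations:
          summary: "LLM P95 latency above 5s"
      - alert: LangChainTokenBudgetExceeded
        expr: sum(increase(langchain_llm_tokens_total[1d])) > 5000000  # example budget
        labels: {severity: warn}
        annotations:
          summary: "Daily token usage above budget"
```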

See detailed implementation for complete callback code, dashboard JSON, and alert rules.

Output

  • LangSmith tracing enabled
  • Prometheus metrics exported
  • OpenTelemetry spans
  • Structured logging
  • Grafana dashboard and alerts

Error Handling

| Issue | Cause | Solution |
| --- | --- | --- |
| Missing metrics | Callback not attached | Pass the callback to the LLM or chain constructor (`callbacks=[...]`) |
| Trace gaps | Missing context propagation | Check parent span handling (`parent_run_id`) |
| Alert storms | Thresholds too sensitive | Add a `for:` duration and relax thresholds |

Examples

Basic usage: enable LangSmith tracing via environment variables and attach the Prometheus and structured-logging callbacks with the default metric names; this covers tracing, request/latency/token metrics, and JSON logs for a single-service deployment.

Advanced scenario: in production, export OpenTelemetry spans to a shared collector, separate teams and environments via Prometheus labels and LangSmith projects, and tune alert thresholds to each team's latency and token-budget targets.

Next Steps

Use langchain-incident-runbook for incident response procedures.
