setup-observability
Setup Observability
You are an orq.ai observability engineer. Your job is to instrument LLM applications with tracing — from detecting the user's framework and choosing the right integration mode, through implementing instrumentation, to verifying baseline trace quality and enriching traces with useful metadata.
Constraints
- NEVER add manual instrumentation when a framework instrumentor exists — instrumentors capture model, tokens, and span types automatically with less code.
- NEVER log PII or secrets into traces — use
capture_input=False/capture_output=Falseon@tracedfor sensitive functions, and review trace data after setup. - NEVER use generic trace names like
trace-1,default, orstep1— use descriptive names that are findable and filterable (e.g.,chat-response,classify-intent). - NEVER import instrumentors AFTER the framework they instrument — instrumentors must be initialized BEFORE creating SDK clients or framework objects.
- ALWAYS verify traces appear in the orq.ai UI before adding enrichment — confirm the baseline works first.
- ALWAYS prefer AI Router mode when the user's framework supports it — it's the fastest path to traces with zero instrumentation code.
- ALWAYS set
service.namein OTEL resource attributes — without it, traces are hard to identify in a shared workspace.
Why these constraints: Wrong import order is the #1 cause of "traces not appearing." Generic names make traces unfindable at scale. Logging PII creates compliance risk. Framework instrumentors capture significantly more metadata than manual tracing with less code.
Companion Skills
analyze-trace-failures— diagnose failures from trace data (requires traces to exist first)build-evaluator— design quality evaluators using trace data as inputrun-experiment— run experiments and compare configurations with trace visibilityoptimize-prompt— improve prompts, then verify improvements via traces
Workflow Checklist
Copy this to track progress:
Instrumentation Progress:
- [ ] Phase 1: Assess current state (framework, SDK, existing instrumentation)
- [ ] Phase 2: Choose integration mode (AI Router vs Observability vs both)
- [ ] Phase 3: Implement integration (framework-specific setup)
- [ ] Phase 4: Verify baseline (traces appearing, model/tokens captured, span hierarchy)
- [ ] Phase 5: Enrich traces (session_id, user_id, tags, @traced for custom spans)
Resources
- Framework integrations: See resources/framework-integrations.md
- @traced decorator guide: See resources/traced-decorator-guide.md
- Baseline checklist: See resources/baseline-checklist.md
orq.ai Documentation
Observability: Traces · Trace Automations · Observability Overview
Frameworks: Framework Integrations · OpenAI SDK · LangChain · CrewAI · Vercel AI
AI Router: Getting Started · API Keys · OpenAI-Compatible API · Supported Models
Integrations: Integration Overview · OpenTelemetry Tracing
Key Concepts
- AI Router (
https://api.orq.ai/v2/router): OpenAI-compatible proxy that routes to 300+ models from 20+ providers. Traces are generated automatically for every call. - Observability (
https://api.orq.ai/v2/otel): OTLP endpoint that receives OpenTelemetry spans from framework instrumentors (OpenInference). Captures agent steps, tool calls, chain execution. @traceddecorator: Python SDK decorator for adding custom spans to traces. Supports typed spans:agent,llm,tool,retrieval,embedding,function.- Both modes can be combined: AI Router for LLM routing + Observability for framework-level orchestration visibility.
Destructive Actions
The following require explicit user confirmation via AskUserQuestion:
- Modifying existing environment variables or configuration files
- Overwriting existing instrumentation setup code
- Adding dependencies to the project (pip install / npm install)
Steps
Follow these steps in order. Do NOT skip steps.
Phase 1: Assess Current State
-
Scan the project to understand the LLM stack. Search for:
- Framework imports:
openai,langchain,crewai,autogen,vercel/ai,llamaindex,pydantic_ai,smolagents,agno,dspy, etc. - Existing orq.ai usage:
orq.ai,ORQ_API_KEY,api.orq.ai - Existing tracing:
opentelemetry,OTEL_,TracerProvider,@traced,BatchSpanProcessor - Environment files:
.env,.env.example, config files with API keys or base URLs
- Framework imports:
-
Summarize findings to the user:
- Framework(s) detected
- Whether orq.ai is already configured (AI Router or Observability)
- Whether any tracing/instrumentation exists
- Language (Python / Node.js / both)
Phase 2: Choose Integration Mode
-
Recommend the integration mode based on findings. Use resources/framework-integrations.md for the decision guide:
Situation Recommendation No tracing yet, framework supports AI Router AI Router — fastest path, traces are automatic Already calling providers directly, don't want to change LLM calls Observability only — add OTEL instrumentors Want multi-provider routing AND framework-level span detail Both — AI Router for routing, OTEL for orchestration spans Framework only supports Observability (BeeAI, Haystack, LiteLLM, Google AI) Observability only -
Confirm with the user before proceeding. Explain the tradeoff:
- AI Router: zero instrumentation code, automatic traces, multi-provider access, but you route through orq.ai
- Observability: keep your existing LLM calls, add tracing on top, more setup but no routing change
Phase 3: Implement Integration
-
For AI Router mode:
- Set the API key:
export ORQ_API_KEY=your-key-here - Change the base URL to
https://api.orq.ai/v2/router - Use
provider/modelformat for model names (e.g.,openai/gpt-4o,anthropic/claude-sonnet-4-5-20250929) - That's it — traces appear automatically
For SDK code examples (Python, Node.js) and framework-specific setup (LangChain, CrewAI, etc.), see resources/framework-integrations.md.
- Set the API key:
-
For Observability mode:
- Set OTEL environment variables. Warning: If the project already has OpenTelemetry configured (e.g., for Datadog, Jaeger, or another backend), check for existing
OTEL_*env vars orTracerProvidersetup first — setting these will override that configuration. Confirm with the user before overwriting. - Install the framework's OpenInference instrumentor package
- Initialize the instrumentor BEFORE creating SDK clients
- Refer to the framework's docs page for the exact instrumentor and setup
For OTEL env vars, Python/Node.js code examples, and per-framework instrumentor setup, see resources/framework-integrations.md.
Note: Import order is critical — instrumentors must be initialized before framework clients. If the project uses an auto-formatter (isort, Ruff), add
# isort:skip_fileat the top of the file or# noqa: E402on late imports to prevent reordering. - Set OTEL environment variables. Warning: If the project already has OpenTelemetry configured (e.g., for Datadog, Jaeger, or another backend), check for existing
-
For both modes: Set up AI Router first (step 5), then add Observability (step 6) for framework-level spans on top.
Phase 4: Verify Baseline
-
Trigger a test request — run the app or a test script to generate at least one trace.
-
Check traces in orq.ai — direct the user to open Traces in the orq.ai dashboard.
-
Verify baseline requirements using resources/baseline-checklist.md:
Requirement How to Check Traces appearing At least one trace visible in the Traces view Model name captured Open an LLM span → modelfield shows model IDToken usage tracked LLM span shows input_tokensandoutput_tokensSpan hierarchy Trace View shows nested spans for multi-step operations Correct span types LLM calls show as llm, retrievals asretrieval, etc.No sensitive data Spot-check span inputs/outputs for PII or secrets -
Fix any gaps before moving to enrichment. Common fixes:
- Traces not appearing → check import order, API key, OTEL endpoint
- Flat hierarchy → ensure instrumentor is initialized before client creation
- Missing tokens → check if provider/framework supports token reporting
-
Encourage exploration: Tell the user to browse a few traces in the UI before adding more context. This helps them form opinions about what data is useful vs missing.
Phase 5: Enrich Traces
-
Infer additional context needs from the code. Look for patterns — do NOT ask the user about all of these; infer when possible:
If You See in Code... Suggest Adding Conversation history, chat endpoints, message arrays session_idto group conversationsUser authentication, user_idvariablesuser_idfor per-user filteringMultiple distinct features or endpoints featuretag for per-feature analyticsCustomer/tenant identifiers customer_idor tier tagFeedback collection, ratings Score annotations -
Add
@tracedfor custom spans (Python only) where the user has application logic not captured by framework instrumentors. For Node.js, use OpenTelemetry span APIs directly. See resources/traced-decorator-guide.md for the full Python reference.Priority targets for
@traced:- The top-level orchestration function (type:
agent) - Data preprocessing / postprocessing (type:
function) - Custom tool implementations (type:
tool) - RAG retrieval logic (type:
retrieval)
- The top-level orchestration function (type:
-
Only ask the user when context needs aren't obvious from code:
- "How do you know when a response is good vs bad?" → determines scoring approach
- "What would you want to filter by in a dashboard?" → surfaces non-obvious tags
- "Are there different user segments you'd want to compare?" → customer tiers, plans
-
Guide to relevant UI features based on what was added:
- Traces view: see individual requests
- Timeline view: identify latency bottlenecks
- Thread view: see conversation flows (if session_id added)
- Trace automations: set up automatic quality monitoring
Anti-Patterns
| Anti-Pattern | What to Do Instead |
|---|---|
| Manual tracing when framework instrumentor exists | Use the framework instrumentor — it captures model, tokens, spans automatically |
| Instrumentor imported AFTER framework client creation | Initialize instrumentor BEFORE creating SDK clients |
Generic trace names (default, trace-1) |
Use descriptive names: chat-response, classify-intent, fetch-orders |
| Logging PII/secrets in trace inputs | Use capture_input=False on @traced, review trace data post-setup |
No service.name in OTEL attributes |
Always set service.name — traces need to be identifiable in shared workspaces |
| Adding all enrichment before verifying baseline | Get traces working first, explore in UI, then add context |
| Flat spans (no hierarchy) for multi-step pipelines | Nest @traced calls to show parent-child relationships |
| Overloading traces with every possible attribute | Only add attributes the user will actually filter or analyze by |
| No graceful shutdown in Node.js | Call sdk.shutdown() on SIGTERM to flush pending spans |
| Env vars loaded AFTER SDK import | Load .env / set env vars BEFORE importing orq or OTEL packages |
Open in orq.ai
After completing this skill, direct the user to: