exa-observability
Exa Observability
Overview
Monitor Exa AI search API performance, result quality, and cost efficiency. Key metrics include search latency (Exa neural search typically takes 500-2000ms), result relevance (measured by click-through or downstream usage), search volume by type (neural vs keyword vs auto), per-search cost tracking, and cache hit rates for repeated queries.
Prerequisites
- Exa API integration in production
- Metrics backend (Prometheus, Datadog, or equivalent)
- Request logging infrastructure
Instructions
Step 1: Instrument the Exa Client
import Exa from 'exa-js';

// emitHistogram, emitCounter, and emitGauge are thin wrappers around your metrics client.
async function trackedSearch(exa: Exa, query: string, options: any = {}) {
  // Resolve the search type once so every metric carries the same label value.
  const type = options.type || 'auto';
  const start = performance.now();
  try {
    const results = await exa.search(query, options);
    const duration = performance.now() - start;
    emitHistogram('exa_search_duration_ms', duration, { type });
    emitCounter('exa_searches_total', 1, { type, status: 'success' });
    emitGauge('exa_results_count', results.results.length, { type });
    return results;
  } catch (err: any) {
    // Label errors with the search type too, so error rates can be split by type.
    emitCounter('exa_searches_total', 1, { type, status: 'error', code: String(err?.status ?? 'unknown') });
    throw err;
  }
}
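Step 1 calls `emitHistogram`, `emitCounter`, and `emitGauge` without defining them. A minimal sketch of what they might look like, assuming an in-memory sink for local testing. The `seriesKey` helper and the three stores are hypothetical names, not part of `exa-js` or any metrics library; in production the function bodies would delegate to your Prometheus or Datadog client.

```typescript
// Hypothetical in-memory metric sink. Replace the bodies with calls to a real metrics client.
type Labels = Record<string, string>;

const counters: Record<string, number> = {};
const gauges: Record<string, number> = {};
const histograms: Record<string, number[]> = {};

// Build a stable series key so label order does not create duplicate series.
function seriesKey(name: string, labels: Labels): string {
  const parts = Object.keys(labels)
    .sort()
    .map((k) => `${k}="${labels[k]}"`);
  return `${name}{${parts.join(',')}}`;
}

// Counters only accumulate.
function emitCounter(name: string, value: number, labels: Labels = {}): void {
  const key = seriesKey(name, labels);
  counters[key] = (counters[key] ?? 0) + value;
}

// Gauges keep the most recent value.
function emitGauge(name: string, value: number, labels: Labels = {}): void {
  gauges[seriesKey(name, labels)] = value;
}

// Histograms keep raw observations here; a real client would bucket them.
function emitHistogram(name: string, value: number, labels: Labels = {}): void {
  const key = seriesKey(name, labels);
  const samples = histograms[key] ?? [];
  samples.push(value);
  histograms[key] = samples;
}
```

Keeping the helpers behind these three signatures means the instrumentation code does not change when the backend does.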
Step 2: Track Result Quality
// Measure whether search results are actually used by downstream consumers
function trackResultUsage(searchId: string, resultIndex: number, action: 'clicked' | 'used_in_context' | 'discarded') {
  // searchId is for log correlation only; using it as a metric label would create one series per search.
  emitCounter('exa_result_usage', 1, { action, position: String(resultIndex) });
  // Results at positions 0-2 should see high usage; if they do not, the query likely needs tuning.
}
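To make the "positions 0-2 should have high usage" check concrete, the following sketch computes a usage rate per result position from a batch of usage events. It assumes the events are available in application code (for example from request logs); `UsageEvent` and `usageRateByPosition` are hypothetical names introduced for illustration.

```typescript
type UsageAction = 'clicked' | 'used_in_context' | 'discarded';

interface UsageEvent {
  position: number;
  action: UsageAction;
}

// Fraction of results at each position that were clicked or used in context.
function usageRateByPosition(events: UsageEvent[]): Record<number, number> {
  const used: Record<number, number> = {};
  const total: Record<number, number> = {};
  for (const e of events) {
    total[e.position] = (total[e.position] ?? 0) + 1;
    if (e.action !== 'discarded') {
      used[e.position] = (used[e.position] ?? 0) + 1;
    }
  }
  const rates: Record<number, number> = {};
  for (const key of Object.keys(total)) {
    const pos = Number(key);
    rates[pos] = (used[pos] ?? 0) / (total[pos] as number);
  }
  return rates;
}
```

A low rate at position 0 alongside a higher rate further down the list suggests the ranking does not match what consumers need, which is a cue to revisit the query or the search type.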
Step 3: Monitor Search Budget
set -euo pipefail
# Check remaining search quota.
# Confirm the usage endpoint and the response field names against Exa's current API reference before relying on this.
curl -sf https://api.exa.ai/v1/usage \
  -H "x-api-key: $EXA_API_KEY" | \
  jq '{searches_today, searches_this_month, monthly_limit, budget_remaining_pct: (1 - .searches_this_month / .monthly_limit) * 100}'
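The same budget arithmetic can run in application code, where the result can feed a gauge such as the `exa_monthly_searches_remaining` series that the budget alert in Step 4 reads. This is a sketch under assumptions: the field names mirror the jq filter above and are not a confirmed response schema, and `budgetRemaining` is a hypothetical helper.

```typescript
// Field names follow the jq filter above; treat them as assumptions about the response shape.
interface UsageSnapshot {
  searches_this_month: number;
  monthly_limit: number;
}

interface BudgetStatus {
  remaining: number;
  remainingPct: number;
}

// Compute remaining searches and remaining percentage, guarding against a zero or missing limit.
function budgetRemaining(usage: UsageSnapshot): BudgetStatus {
  if (!(usage.monthly_limit > 0)) {
    return { remaining: 0, remainingPct: 0 };
  }
  const remaining = Math.max(0, usage.monthly_limit - usage.searches_this_month);
  return { remaining, remainingPct: (remaining / usage.monthly_limit) * 100 };
}
```

Clamping at zero keeps the gauge from going negative if usage overshoots the limit, and the guard avoids the division by zero that the jq expression would hit with a zero `monthly_limit`.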
Step 4: Configure Alerts
groups:
  - name: exa
    rules:
      - alert: ExaHighLatency
        # Aggregate buckets by le so the quantile covers all search types and instances.
        expr: histogram_quantile(0.95, sum by (le) (rate(exa_search_duration_ms_bucket[5m]))) > 3000  # 3000 ms = 3 seconds
        annotations: { summary: "Exa search P95 latency exceeds 3 seconds" }
      - alert: ExaBudgetLow
        expr: exa_monthly_searches_remaining < 1000  # fewer than 1000 searches left this month
        annotations: { summary: "Exa monthly search budget nearly exhausted" }
      - alert: ExaLowResultQuality
        # Sum both sides; an unaggregated ratio only matches series with identical labels and always yields 1.
        expr: sum(rate(exa_result_usage{action="discarded"}[1h])) / sum(rate(exa_result_usage[1h])) > 0.5
        annotations: { summary: "Over 50% of Exa search results being discarded" }
      - alert: ExaApiErrors
        expr: sum(rate(exa_searches_total{status="error"}[5m])) > 0.1  # more than 0.1 errors per second
        annotations: { summary: "Exa API errors detected" }
Step 5: Build a Search Efficiency Dashboard
Key panels: search volume by type (neural/keyword/auto), latency p50/p95, results per search distribution, result usage rate (used vs discarded), daily cost tracking, and cache hit rate. Low result counts with high latency indicate poorly formed queries.
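The cache hit rate panel needs hit and miss counts from the application. A minimal sketch, assuming an in-memory cache placed in front of Exa calls; `CountingCache` and `cacheKey` are hypothetical names, and the key format is only one possible choice.

```typescript
// Hypothetical in-memory cache that counts hits and misses for a dashboard panel.
class CountingCache<V> {
  private store: Record<string, V> = {};
  hits = 0;
  misses = 0;

  // Return the cached value for key, or compute, store, and return it.
  getOrCompute(key: string, compute: () => V): V {
    if (Object.prototype.hasOwnProperty.call(this.store, key)) {
      this.hits += 1;
      return this.store[key] as V;
    }
    this.misses += 1;
    const value = compute();
    this.store[key] = value;
    return value;
  }

  // Hit rate in the range 0..1; 0 when nothing has been requested yet.
  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

// Separate cache entries per search type, since the same query can return different results.
function cacheKey(query: string, type: string = 'auto'): string {
  return `${type}:${query.trim()}`;
}
```

`V` can be a promise, so concurrent identical queries share one request. The sketch has no TTL or eviction, and a rejected promise would stay cached, so production code needs both expiry and removal on failure.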
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| 429 Too Many Requests | Rate limit exceeded | Implement exponential backoff and request queue |
| Zero results returned | Query too specific or domain filter too narrow | Broaden query, remove includeDomains filter |
| Latency spike to 5s+ | Neural search on complex query | Use type: "keyword" for simpler lookups |
| Monthly budget exhausted | Uncapped search volume | Add application-level search budget tracking |
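
The exponential backoff suggested for 429 responses can be sketched as a small retry wrapper. It assumes the thrown error exposes an HTTP `status` field, as the Step 1 code does; `withBackoff` and `RetryOptions` are hypothetical names, and jitter and request queuing are left out for brevity.

```typescript
interface RetryOptions {
  maxRetries: number;
  baseDelayMs: number;
  // Injectable for tests; defaults to a real timer.
  sleep?: (ms: number) => Promise<void>;
}

// Retry fn on HTTP 429, doubling the delay after each failed attempt.
async function withBackoff<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as { status?: number } | null)?.status;
      // Give up on errors other than 429, or once the retry budget is spent.
      if (status !== 429 || attempt >= opts.maxRetries) {
        throw err;
      }
      await sleep(opts.baseDelayMs * Math.pow(2, attempt));
    }
  }
}
```

Wrapping the call as `withBackoff(() => trackedSearch(exa, query, options), { maxRetries: 3, baseDelayMs: 500 })` would count each failed attempt in `exa_searches_total`, which keeps the error-rate alert honest about how often the limit is hit.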
Examples
Basic usage: Route every Exa search through trackedSearch and enable the latency and error alerts with their default thresholds.
Advanced scenario: Add result usage tracking, the budget alert, and the search efficiency dashboard, then tune alert thresholds to the traffic and budget of each team or environment.
Output
- Configuration files or code changes applied to the project
- Validation report confirming correct implementation
- Summary of changes made and their rationale
Resources
- Official monitoring documentation
- Community best practices and patterns
- Related skills in this plugin pack