# OpenTelemetry Implementation Guide

## Overview
OpenTelemetry (OTel) is a vendor-neutral observability framework for instrumenting, generating, collecting, and exporting telemetry data (traces, metrics, logs). This skill provides guidance for implementing OTel in Kubernetes environments.
## Quick Start

### Deploy the OTel Collector on Kubernetes
```bash
# Add the Helm repo
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update

# Install with a basic config
helm install otel-collector open-telemetry/opentelemetry-collector \
  --namespace monitoring --create-namespace \
  --set mode=daemonset
```
### Send Test Data via OTLP
```bash
# gRPC endpoint: 4317, HTTP endpoint: 4318
curl -X POST http://otel-collector:4318/v1/traces \
  -H "Content-Type: application/json" \
  -d '{"resourceSpans":[]}'
```
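The empty `resourceSpans` payload above only verifies connectivity. A minimal sketch of a non-empty OTLP/HTTP JSON trace payload, built with the Python standard library (the service and span names are placeholders):

```python
import json
import os
import time

def make_trace_payload(service="demo-service", span_name="demo-span"):
    """Build a minimal OTLP/HTTP JSON trace payload containing one span."""
    now = time.time_ns()
    return {
        "resourceSpans": [{
            "resource": {"attributes": [{
                "key": "service.name",
                "value": {"stringValue": service},
            }]},
            "scopeSpans": [{
                "scope": {"name": "manual-test"},
                "spans": [{
                    # IDs are hex strings: 16 random bytes for traceId, 8 for spanId
                    "traceId": os.urandom(16).hex(),
                    "spanId": os.urandom(8).hex(),
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    "startTimeUnixNano": str(now - 1_000_000),
                    "endTimeUnixNano": str(now),
                }],
            }],
        }]
    }

body = json.dumps(make_trace_payload())
# POST `body` to http://otel-collector:4318/v1/traces with
# Content-Type: application/json (via urllib.request, or curl as above).
```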
## Core Concepts

**Signals**: Three types of telemetry data:
- Traces: Distributed request flows across services
- Metrics: Numerical measurements (counters, gauges, histograms)
- Logs: Event records with structured/unstructured data
**Collector Components**:
- Receivers: Accept data (OTLP, Prometheus, Jaeger, Zipkin)
- Processors: Transform data (batch, memory_limiter, k8sattributes)
- Exporters: Send data (prometheusremotewrite, loki, otlp)
- Extensions: Add capabilities (health_check, pprof, zpages)
## Collector Configuration

### Basic Pipeline Structure
```yaml
config:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: ${env:MY_POD_IP}:4317
        http:
          endpoint: ${env:MY_POD_IP}:4318
  processors:
    batch:
      timeout: 10s
      send_batch_size: 1024
    memory_limiter:
      check_interval: 5s
      limit_percentage: 80
      spike_limit_percentage: 25
  exporters:
    prometheusremotewrite:
      endpoint: "http://prometheus:9090/api/v1/write"
    loki:
      endpoint: "http://loki:3100/loki/api/v1/push"
    otlp/tempo:
      # Referenced by the traces pipeline below; the endpoint is an assumed Tempo address
      endpoint: "tempo:4317"
      tls:
        insecure: true
  service:
    pipelines:
      metrics:
        receivers: [otlp]
        processors: [memory_limiter, batch]
        exporters: [prometheusremotewrite]
      logs:
        receivers: [otlp]
        processors: [memory_limiter, batch]
        exporters: [loki]
      traces:
        receivers: [otlp]
        processors: [memory_limiter, batch]
        exporters: [otlp/tempo]
```
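With percentage-based settings, `memory_limiter` derives its thresholds from the container's memory limit: the hard limit is `limit_percentage` of the total, and the soft limit (where it starts refusing data) sits `spike_limit_percentage` below it. A quick sketch of the arithmetic for a 1GiB container:

```python
def memory_limiter_thresholds(total_bytes, limit_pct=80, spike_pct=25):
    """Derive memory_limiter hard/soft limits from the container memory limit."""
    hard = total_bytes * limit_pct // 100   # limit_percentage
    spike = total_bytes * spike_pct // 100  # spike_limit_percentage
    soft = hard - spike                     # backpressure kicks in here
    return hard, soft

hard, soft = memory_limiter_thresholds(1 * 1024**3)
print(hard // 2**20, soft // 2**20)  # thresholds in MiB
```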
### Kubernetes Attributes Enrichment
```yaml
processors:
  k8sattributes:
    auth_type: "serviceAccount"
    passthrough: false
    filter:
      # Takes the *name* of the env var holding the node name, not an ${env:} expansion
      node_from_env_var: K8S_NODE_NAME
    extract:
      metadata:
        - k8s.pod.name
        - k8s.namespace.name
        - k8s.deployment.name
        - k8s.node.name
```
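Conceptually, `k8sattributes` looks up the pod that emitted each signal and merges the extracted metadata into that signal's resource attributes, without overwriting attributes the SDK already set. A toy illustration (the attribute values are placeholders):

```python
def enrich(resource_attrs, pod_metadata, extract_keys):
    """Merge selected pod metadata into a signal's resource attributes."""
    enriched = dict(resource_attrs)
    for key in extract_keys:
        if key in pod_metadata:
            enriched.setdefault(key, pod_metadata[key])  # keep existing values
    return enriched

attrs = enrich(
    {"service.name": "checkout"},
    {"k8s.pod.name": "checkout-7f9c", "k8s.namespace.name": "shop",
     "k8s.node.name": "node-1"},
    ["k8s.pod.name", "k8s.namespace.name", "k8s.deployment.name", "k8s.node.name"],
)
```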
## Deployment Modes
| Mode | Use Case | Pros | Cons |
|---|---|---|---|
| DaemonSet | Node-level collection | Full coverage, host metrics | Higher resource usage |
| Deployment | Centralized gateway | Scalable, easier management | Single point of failure |
| Sidecar | Per-pod collection | Isolated, fine-grained | Resource overhead per pod |
## Common Patterns

### Development Environment
- Enable debug exporter for visibility
- Lower resource limits (250m CPU, 512Mi memory)
- Include spot instance tolerations for cost savings
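A values fragment along these lines could capture those development settings (the `debug` exporter and `resources` keys follow the chart's conventions; the toleration key is an assumption for your spot node pools):

```yaml
config:
  exporters:
    debug:
      verbosity: detailed
  service:
    pipelines:
      traces:
        exporters: [debug]
resources:
  limits:
    cpu: 250m
    memory: 512Mi
tolerations:
  - key: "node.kubernetes.io/spot"  # assumed spot-instance taint key
    operator: "Exists"
    effect: "NoSchedule"
```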
### Production Environment
- Implement sampling (10-50% for traces)
- Higher batch sizes (2048-4096)
- Enable autoscaling and PodDisruptionBudget
- Use TLS for all endpoints
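The trace-sampling recommendation above is commonly implemented as deterministic trace-ID ratio sampling, so every service hop makes the same keep/drop decision for a given trace. A minimal sketch of the idea (the real SDK samplers perform a similar comparison on the trace ID):

```python
def sample(trace_id_hex, ratio=0.10):
    """Deterministic trace-ID ratio sampling: keep the lowest `ratio` of IDs."""
    # Compare the 128-bit trace ID against the ratio-scaled ID space.
    threshold = int(ratio * (1 << 128))
    return int(trace_id_hex, 16) < threshold

# The same trace ID yields the same decision on every service.
assert sample("0" * 32) is True   # smallest possible ID: kept
assert sample("f" * 32) is False  # largest possible ID: dropped
```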
## Detailed References
For in-depth guidance, see:
- Collector Configuration: COLLECTOR.md
- Kubernetes Deployment: KUBERNETES.md
- Troubleshooting: TROUBLESHOOTING.md
- Instrumentation: INSTRUMENTATION.md
## Validation Commands
```bash
# Check collector pods
kubectl get pods -n monitoring -l app.kubernetes.io/name=otel-collector

# View collector logs
kubectl logs -n monitoring -l app.kubernetes.io/name=otel-collector --tail=100

# Test OTLP endpoint reachability (an HTTP error for a bare GET still proves the server answered)
kubectl run test-otlp --image=curlimages/curl:latest --rm -it --restart=Never \
  --command -- curl -v http://otel-collector.monitoring:4318/v1/traces

# Validate config syntax
otelcol validate --config=config.yaml
```
## Key Helm Chart Values
```yaml
mode: "daemonset"  # or "deployment"
presets:
  logsCollection:
    enabled: true
  hostMetrics:
    enabled: true
  kubernetesAttributes:
    enabled: true
  kubeletMetrics:
    enabled: true
useGOMEMLIMIT: true
resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 100m
    memory: 256Mi
```
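`useGOMEMLIMIT: true` has the chart set Go's soft memory limit from the container's memory limit, leaving headroom for non-heap usage; the fraction used here is an assumption (roughly 80% at the time of writing). The arithmetic, as a sketch:

```python
def gomemlimit_bytes(memory_limit_bytes, fraction=0.8):
    """Approximate GOMEMLIMIT derived from resources.limits.memory (assumed ~80%)."""
    return int(memory_limit_bytes * fraction)

limit = 1 * 1024**3  # memory: 1Gi
print(gomemlimit_bytes(limit))  # bytes, ~80% of the 1GiB container limit
```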