dotnet-observability
Modern observability for .NET applications using OpenTelemetry, structured logging, health checks, and custom metrics. Covers the three pillars of observability (traces, metrics, logs), integration with Microsoft.Extensions.Diagnostics and System.Diagnostics, and production-ready health check patterns.
Out of scope: DI container mechanics and service lifetimes -- see [skill:dotnet-csharp-dependency-injection]. Async/await patterns -- see [skill:dotnet-csharp-async-patterns]. Testing observability output -- see [skill:dotnet-integration-testing] for verifying telemetry in integration tests. CI/CD pipeline integration for telemetry collection -- see [skill:dotnet-gha-patterns] and [skill:dotnet-ado-patterns]. Middleware pipeline patterns (request logging middleware, exception handling middleware) -- see [skill:dotnet-middleware-patterns].
Cross-references: [skill:dotnet-csharp-dependency-injection] for service registration, [skill:dotnet-csharp-async-patterns] for async patterns in background exporters, [skill:dotnet-resilience] for Polly telemetry integration, [skill:dotnet-middleware-patterns] for request/exception logging middleware.
OpenTelemetry Setup
OpenTelemetry is the standard, vendor-neutral observability framework for .NET. The .NET base class library provides System.Diagnostics.Activity (traces) and System.Diagnostics.Metrics (metrics) natively; the OpenTelemetry SDK collects and exports the data they produce.
Package Landscape
| Package | Purpose |
|---|---|
| `OpenTelemetry.Extensions.Hosting` | Host integration, lifecycle management |
| `OpenTelemetry.Instrumentation.AspNetCore` | Automatic HTTP server trace/metric instrumentation |
| `OpenTelemetry.Instrumentation.Http` | Automatic HttpClient trace/metric instrumentation |
| `OpenTelemetry.Instrumentation.Runtime` | GC, thread pool, assembly metrics |
| `OpenTelemetry.Exporter.OpenTelemetryProtocol` | OTLP exporter (gRPC/HTTP) for collectors |
| `OpenTelemetry.Exporter.Console` | Console exporter for local development |
Install the core stack:
<PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.*" />
<PackageReference Include="OpenTelemetry.Instrumentation.AspNetCore" Version="1.*" />
<PackageReference Include="OpenTelemetry.Instrumentation.Http" Version="1.*" />
<PackageReference Include="OpenTelemetry.Instrumentation.Runtime" Version="1.*" />
<PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.*" />
Aspire Service Defaults Integration
If using .NET Aspire, the ServiceDefaults project configures OpenTelemetry automatically. This is the recommended approach for Aspire apps -- do not duplicate this configuration manually:
// ServiceDefaults/Extensions.cs (generated by Aspire)
public static IHostApplicationBuilder AddServiceDefaults(
this IHostApplicationBuilder builder)
{
builder.ConfigureOpenTelemetry();
builder.AddDefaultHealthChecks();
// ... other defaults
return builder;
}
For non-Aspire apps, configure OpenTelemetry explicitly as shown below.
Full Configuration (Non-Aspire)
using System.Reflection;          // GetCustomAttribute<T>()
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOpenTelemetry()
    .ConfigureResource(resource => resource
        .AddService(
            serviceName: builder.Environment.ApplicationName,
            serviceVersion: typeof(Program).Assembly
                .GetCustomAttribute<AssemblyInformationalVersionAttribute>()
                ?.InformationalVersion ?? "unknown"))
    .WithTracing(tracing => tracing
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddSource("MyApp.*") // Custom ActivitySources
        .AddOtlpExporter())
    .WithMetrics(metrics => metrics
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddRuntimeInstrumentation()
        .AddMeter("MyApp.*") // Custom Meters
        .AddOtlpExporter());
OTLP Configuration via Environment Variables
The OTLP exporter reads standard environment variables -- no code changes needed between environments:
# Collector endpoint (gRPC default)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
# Or HTTP/protobuf
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
# Resource attributes
OTEL_RESOURCE_ATTRIBUTES=deployment.environment=production,service.namespace=myapp
# Service name (overrides code-based configuration)
OTEL_SERVICE_NAME=order-api
Distributed Tracing
How .NET Tracing Works
.NET uses System.Diagnostics.Activity as its native tracing primitive. OpenTelemetry maps these to spans:
| .NET Concept | OpenTelemetry Concept |
|---|---|
| `ActivitySource` | Tracer |
| `Activity` | Span |
| `Activity.SetTag` | Span attribute |
| `Activity.AddEvent` | Span event |
| `Activity.SetStatus` | Span status |
Custom Traces
public sealed class OrderService
{
// One ActivitySource per logical component, named after the namespace
private static readonly ActivitySource s_activitySource = new("MyApp.Orders");
public async Task<Order> CreateOrderAsync(
CreateOrderRequest request,
CancellationToken ct)
{
using var activity = s_activitySource.StartActivity(
"CreateOrder",
ActivityKind.Internal);
activity?.SetTag("order.customer_id", request.CustomerId);
activity?.SetTag("order.line_count", request.Lines.Count);
var order = new Order { /* ... */ };
activity?.AddEvent(new ActivityEvent("OrderValidated"));
await _db.Orders.AddAsync(order, ct);
await _db.SaveChangesAsync(ct);
activity?.SetTag("order.id", order.Id);
activity?.SetStatus(ActivityStatusCode.Ok);
return order;
}
}
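The null-conditional calls (`activity?.`) above matter because `StartActivity` returns null when nothing is listening to the source. The following is a minimal sketch using only `System.Diagnostics`, in which an `ActivityListener` stands in for the role the OpenTelemetry SDK plays after `.AddSource(...)`. The source name `MyApp.Demo` and the simulated failure are assumptions made for illustration; the sketch also shows one way to mark a span as failed, which the example above does not cover.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical source name, used only for this sketch.
var source = new ActivitySource("MyApp.Demo");

// With no listener attached, StartActivity returns null.
Activity? before = source.StartActivity("NoListener");
Console.WriteLine($"Without listener: {(before is null ? "null" : "created")}");

// The listener stands in for the OpenTelemetry SDK subscribing to "MyApp.*".
var stopped = new List<Activity>();
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name.StartsWith("MyApp.", StringComparison.Ordinal),
    Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
        ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = stopped.Add
};
ActivitySource.AddActivityListener(listener);

using (Activity? activity = source.StartActivity("CreateOrder", ActivityKind.Internal))
{
    try
    {
        activity?.SetTag("order.line_count", 3);
        throw new InvalidOperationException("Simulated failure");
    }
    catch (InvalidOperationException ex)
    {
        // Mark the span as failed so a backend can surface it as an error.
        activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
    }
}

Activity recorded = stopped[0];
Console.WriteLine($"{recorded.OperationName}: {recorded.Status} ({recorded.StatusDescription})");
```

In a real service the failure path would usually rethrow after setting the status; the sketch swallows the exception only to keep the program runnable end to end.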
Trace Context Propagation
W3C Trace Context is the default propagation format in .NET. It works automatically across HTTP boundaries with HttpClient:
// Trace context is automatically propagated via traceparent/tracestate headers
// when using HttpClient with OpenTelemetry.Instrumentation.Http.
// No manual propagation needed for HTTP-based communication.
For message-based communication (queues, event buses), propagate context explicitly:
// Producer: inject context into message headers
var propagator = Propagators.DefaultTextMapPropagator;
var carrier = new Dictionary<string, string>();
var currentActivity = Activity.Current;
if (currentActivity is not null)
{
propagator.Inject(
new PropagationContext(currentActivity.Context, Baggage.Current),
carrier,
(dict, key, value) => dict[key] = value);
}
// Attach carrier as message headers
// Consumer: extract context from message headers
var parentContext = propagator.Extract(
default,
messageHeaders,
(headers, key) => headers.TryGetValue(key, out var value)
? [value] : []);
using var activity = s_activitySource.StartActivity(
"ProcessMessage",
ActivityKind.Consumer,
parentContext.ActivityContext);
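To make the header mechanics concrete, here is a minimal round-trip sketch that uses only `System.Diagnostics` rather than the OpenTelemetry propagator API shown above. It assumes the default W3C id format, writes the producer's `Activity.Id` as the `traceparent` value, and parses it back with `ActivityContext.TryParse`. The source name `MyApp.Messaging` and the in-memory `headers` dictionary are hypothetical stand-ins for a real message transport.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical source name; the listener stands in for the OpenTelemetry SDK.
var source = new ActivitySource("MyApp.Messaging");
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "MyApp.Messaging",
    Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
        ActivitySamplingResult.AllDataAndRecorded
};
ActivitySource.AddActivityListener(listener);

// Producer: write the W3C traceparent value into the message headers.
var headers = new Dictionary<string, string>();
ActivityTraceId producerTraceId;
using (Activity? publish = source.StartActivity("PublishMessage", ActivityKind.Producer))
{
    producerTraceId = publish!.TraceId;
    headers["traceparent"] = publish.Id!; // 00-<trace-id>-<span-id>-<flags>
}

// Consumer: parse the header and use it as the parent of the processing span.
bool parsed = ActivityContext.TryParse(
    headers["traceparent"], traceState: null, out ActivityContext parent);
using Activity? process = source.StartActivity(
    "ProcessMessage", ActivityKind.Consumer, parent);

Console.WriteLine(headers["traceparent"]);
Console.WriteLine($"Parsed: {parsed}, same trace: {process!.TraceId == producerTraceId}");
```

The propagator-based version above is still the better default because it also carries `tracestate` and baggage; this sketch only shows what ends up on the wire for the trace id.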
Metrics
Built-in Metrics
ASP.NET Core and HttpClient emit metrics automatically when OpenTelemetry instrumentation is configured:
| Meter | Key Metrics |
|---|---|
| `Microsoft.AspNetCore.Hosting` | `http.server.request.duration`, `http.server.active_requests` |
| `Microsoft.AspNetCore.Routing` | `aspnetcore.routing.match_attempts` |
| `System.Net.Http` | `http.client.request.duration`, `http.client.active_requests` |
| `System.Runtime` (.NET 9+) | `dotnet.gc.collections`, `dotnet.thread_pool.thread.count` |
| `OpenTelemetry.Instrumentation.Runtime` (.NET 8 and earlier) | `process.runtime.dotnet.gc.collections.count`, `process.runtime.dotnet.thread_pool.threads.count` |
Custom Metrics
Use System.Diagnostics.Metrics for application-specific metrics:
public sealed class OrderMetrics
{
// One Meter per logical component
private readonly Counter<long> _ordersCreated;
private readonly Histogram<double> _orderProcessingDuration;
private readonly UpDownCounter<long> _activeOrders;
public OrderMetrics(IMeterFactory meterFactory)
{
var meter = meterFactory.Create("MyApp.Orders");
_ordersCreated = meter.CreateCounter<long>(
"myapp.orders.created",
unit: "{order}",
description: "Number of orders created");
_orderProcessingDuration = meter.CreateHistogram<double>(
"myapp.orders.processing_duration",
unit: "s",
description: "Time to process an order");
_activeOrders = meter.CreateUpDownCounter<long>(
"myapp.orders.active",
unit: "{order}",
description: "Number of orders currently being processed");
}
public void RecordOrderCreated(string region)
{
_ordersCreated.Add(1, new KeyValuePair<string, object?>("region", region));
}
public void RecordProcessingDuration(double seconds)
{
_orderProcessingDuration.Record(seconds);
}
public void IncrementActiveOrders() => _activeOrders.Add(1);
public void DecrementActiveOrders() => _activeOrders.Add(-1);
}
Register the metrics class in DI:
builder.Services.AddSingleton<OrderMetrics>();
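To check locally that instruments actually emit measurements, a `MeterListener` can subscribe by meter name, which is comparable to what an exporter pipeline does after `.AddMeter(...)`. The sketch below uses only `System.Diagnostics.Metrics`; the meter name `MyApp.Orders.Demo`, the `region` value, and the `Thread.Sleep` stand-in for work are assumptions for illustration. It creates the meter directly instead of through `IMeterFactory` so that it runs without a DI container.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;
using System.Threading;

// Hypothetical meter name; instrument names mirror the OrderMetrics class above.
using var meter = new Meter("MyApp.Orders.Demo");
Counter<long> ordersCreated = meter.CreateCounter<long>(
    "myapp.orders.created", unit: "{order}");
Histogram<double> duration = meter.CreateHistogram<double>(
    "myapp.orders.processing_duration", unit: "s");

var observed = new List<string>();
using var listener = new MeterListener();

// Subscribe only to instruments published by the meter we care about.
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "MyApp.Orders.Demo")
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, _) =>
    observed.Add($"{instrument.Name}={value} tags={tags.Length}"));
listener.SetMeasurementEventCallback<double>((instrument, value, tags, _) =>
    observed.Add($"{instrument.Name} positive={value > 0}"));
listener.Start();

// Counter with a single low-cardinality tag.
ordersCreated.Add(1, new KeyValuePair<string, object?>("region", "eu-west"));

// Histogram recorded in seconds, matching the declared unit "s".
var stopwatch = Stopwatch.StartNew();
Thread.Sleep(10); // Stand-in for real work.
duration.Record(stopwatch.Elapsed.TotalSeconds);

observed.ForEach(Console.WriteLine);
```

Recording `TotalSeconds` rather than milliseconds keeps the value consistent with the `s` unit declared on the histogram.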
Metric Naming Conventions
Follow the OpenTelemetry semantic conventions:
- Use lowercase with dots as separators: `myapp.orders.created`
- Use standard units from the spec: `s` (seconds), `ms` (milliseconds), `By` (bytes), `{request}` (dimensionless)
- Prefix with your application/service name: `myapp.*`
- Use consistent tag names across metrics: `region`, `status`, `order.type`
Structured Logging
Microsoft.Extensions.Logging (Built-in)
The built-in logging framework supports structured logging natively. Use compile-time source generators for high-performance logging:
public static partial class Log
{
[LoggerMessage(
Level = LogLevel.Information,
Message = "Order {OrderId} created for customer {CustomerId} with {LineCount} items, total {Total:C}")]
public static partial void OrderCreated(
this ILogger logger,
string orderId,
string customerId,
int lineCount,
decimal total);
[LoggerMessage(
Level = LogLevel.Warning,
Message = "Order {OrderId} processing exceeded threshold: {Duration}ms")]
public static partial void OrderProcessingSlow(
this ILogger logger,
string orderId,
double duration);
[LoggerMessage(
Level = LogLevel.Error,
Message = "Failed to process order {OrderId}")]
public static partial void OrderProcessingFailed(
this ILogger logger,
Exception exception,
string orderId);
}
// Usage
logger.OrderCreated(order.Id, order.CustomerId, order.Lines.Count, order.Total);
Why Source-Generated Logging
- Zero allocation for disabled log levels (the generated method checks IsEnabled before doing any formatting or boxing)
- Compile-time validation of message templates and parameters
- Structured by default -- parameters become named properties in the log event
LoggerMessage.Define (Legacy / Pre-.NET 6)
Before source generators (.NET 5 and earlier), use LoggerMessage.Define to achieve the same zero-allocation benefits. This approach still works in modern .NET and is useful in non-partial classes or when targeting older frameworks:
public static class LogMessages
{
private static readonly Action<ILogger, string, int, Exception?> s_orderCreated =
LoggerMessage.Define<string, int>(
LogLevel.Information,
new EventId(1, nameof(OrderCreated)),
"Order {OrderId} created with {LineCount} items");
public static void OrderCreated(
ILogger logger, string orderId, int lineCount)
=> s_orderCreated(logger, orderId, lineCount, null);
private static readonly Action<ILogger, string, Exception?> s_orderFailed =
LoggerMessage.Define<string>(
LogLevel.Error,
new EventId(2, nameof(OrderFailed)),
"Failed to process order {OrderId}");
public static void OrderFailed(
ILogger logger, string orderId, Exception exception)
=> s_orderFailed(logger, orderId, exception);
}
Prefer [LoggerMessage] source generators for new code targeting .NET 6+. Use LoggerMessage.Define only when source generators are unavailable.
Message Templates: Do and Do Not
Message templates use named placeholders that become structured properties. This is fundamental to structured logging -- violations prevent log indexing and search.
// CORRECT: structured message template with named placeholders
logger.LogInformation("Order {OrderId} shipped to {City}", orderId, city);
// WRONG: string interpolation -- bypasses structured logging entirely
logger.LogInformation($"Order {orderId} shipped to {city}");
// WRONG: string concatenation -- same problem
logger.LogInformation("Order " + orderId + " shipped to " + city);
// WRONG: calling ToString() on an argument -- discards the value's type
logger.LogInformation("Order {OrderId} shipped at {Time}",
orderId, DateTime.UtcNow.ToString("o")); // pass DateTime directly
// CORRECT: pass objects directly, let the formatter handle rendering
logger.LogInformation("Order {OrderId} shipped at {ShippedAt}",
orderId, DateTime.UtcNow);
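The difference between the correct and wrong forms is visible in the state object the logger receives. Below is a minimal sketch, assuming the `Microsoft.Extensions.Logging` package is referenced (it is part of the ASP.NET Core shared framework). `CapturingLogger` is a hypothetical helper written for this illustration, not a library type; it records the property names attached to each log event.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Extensions.Logging;

var logger = new CapturingLogger();
string orderId = "A-100", city = "Oslo";

// Structured: placeholders become named properties on the log event.
logger.LogInformation("Order {OrderId} shipped to {City}", orderId, city);

// Interpolated: values are baked into the text before the logger sees them.
logger.LogInformation($"Order {orderId} shipped to {city}");

foreach (string[] names in logger.PropertyNames)
{
    Console.WriteLine(string.Join(", ", names));
}

// Hypothetical helper: records the property names of every log event.
sealed class CapturingLogger : ILogger
{
    public List<string[]> PropertyNames { get; } = new();

    public IDisposable? BeginScope<TState>(TState state) where TState : notnull => null;

    public bool IsEnabled(LogLevel logLevel) => true;

    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state,
        Exception? exception, Func<TState, Exception?, string> formatter)
    {
        // The built-in extension methods pass state as a list of key/value pairs.
        var pairs = state as IReadOnlyList<KeyValuePair<string, object?>>;
        PropertyNames.Add(pairs?.Select(p => p.Key).ToArray() ?? Array.Empty<string>());
    }
}
```

With the built-in extension methods, the first call should carry `OrderId` and `City` alongside the `{OriginalFormat}` entry, while the interpolated call should carry only `{OriginalFormat}`, leaving a backend nothing to index.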
Log Level Best Practices
| Level | When to Use | Example |
|---|---|---|
| `Trace` | Detailed diagnostic info (method entry/exit, variable values) | `Entering ProcessOrder with {OrderId}` |
| `Debug` | Internal app state useful during development | `Cache hit for product {ProductId}` |
| `Information` | Normal application flow, business events | `Order {OrderId} created successfully` |
| `Warning` | Unexpected situations that do not prevent operation | `Retry {Attempt} for external API call` |
| `Error` | Failures that affect the current operation | `Failed to save order {OrderId}` |
| `Critical` | Application-wide failures requiring immediate action | `Database connection pool exhausted` |
Log Filtering (Microsoft.Extensions.Logging)
Configure log level filtering in appsettings.json to suppress noisy framework logs while keeping application logs at the desired level:
{
"Logging": {
"LogLevel": {
"Default": "Information",
"Microsoft.AspNetCore": "Warning",
"Microsoft.AspNetCore.HttpLogging": "Information",
"Microsoft.EntityFrameworkCore.Database.Command": "Warning",
"System.Net.Http.HttpClient": "Warning",
"MyApp": "Debug"
},
"Console": {
"LogLevel": {
"Default": "Warning"
}
}
}
}
Key filtering rules:
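- Most-specific category wins -- `MyApp.Orders` matches the `MyApp` rule if no more specific override exists
- Provider-level rules take precedence -- rules under the `Console` section apply to the console provider only and are selected ahead of the general `LogLevel` rules, so with the configuration above the console provider logs `MyApp` categories at `Warning`, not `Debug`
- Environment overrides -- use `appsettings.Development.json` to enable `Debug`/`Trace` locally without affecting production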
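Provider-specific rules are easy to misread, so it can help to check the effective level in code. The sketch below assumes the `Microsoft.Extensions.Logging` and `Microsoft.Extensions.Logging.Console` packages and expresses the same two rules as the JSON above through `AddFilter` instead of configuration binding. The category name `MyApp.Orders` is illustrative.

```csharp
using System;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Console;

// General rule only: MyApp.* categories are enabled at Debug.
using var generalOnly = LoggerFactory.Create(b => b
    .AddFilter("MyApp", LogLevel.Debug)
    .AddConsole());

// Same general rule plus a console-specific default of Warning.
using var withProviderRule = LoggerFactory.Create(b => b
    .AddFilter("MyApp", LogLevel.Debug)
    .AddFilter<ConsoleLoggerProvider>(null, LogLevel.Warning)
    .AddConsole());

bool debugGeneralOnly = generalOnly
    .CreateLogger("MyApp.Orders").IsEnabled(LogLevel.Debug);
bool debugWithProviderRule = withProviderRule
    .CreateLogger("MyApp.Orders").IsEnabled(LogLevel.Debug);

Console.WriteLine($"Debug enabled, general rule only: {debugGeneralOnly}");
Console.WriteLine($"Debug enabled, with console rule: {debugWithProviderRule}");
```

Under the rule selection described in the .NET logging documentation, the second logger is expected to report `Debug` as disabled, because the console provider's own rule is chosen before the general `MyApp` rule is considered. If `Debug` output from `MyApp` is wanted on the console, the category needs its own entry under the `Console` section.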
Log Scopes for Correlation
public async Task<Order> ProcessOrderAsync(
string orderId,
CancellationToken ct)
{
using var scope = _logger.BeginScope(
new Dictionary<string, object>
{
["OrderId"] = orderId,
["CorrelationId"] = Activity.Current?.TraceId.ToString() ?? ""
});
// All log messages within this scope include OrderId and CorrelationId
_logger.LogInformation("Starting order processing");
// ...
}
Serilog Integration
For sinks beyond console and OTLP (Elasticsearch, Seq, Datadog), Serilog is a widely used structured logging library with a large sink ecosystem.
| Package | Purpose |
|---|---|
| `Serilog.AspNetCore` | `UseSerilog()` host integration + `UseSerilogRequestLogging()` |
| `Serilog.Settings.Configuration` | `ReadFrom.Configuration()` for appsettings.json binding |
| `Serilog.Sinks.OpenTelemetry` | `WriteTo.OpenTelemetry()` OTLP sink |
| `Serilog.Formatting.Compact` | `RenderedCompactJsonFormatter` for structured console output |
| `Serilog.Enrichers.Environment` | `Enrich.WithMachineName()` and `Enrich.WithEnvironmentName()` |
// Program.cs
builder.Host.UseSerilog((context, loggerConfiguration) =>
{
loggerConfiguration
.ReadFrom.Configuration(context.Configuration)
.Enrich.FromLogContext()
.Enrich.WithMachineName()
.Enrich.WithEnvironmentName()
.WriteTo.Console(new RenderedCompactJsonFormatter())
.WriteTo.OpenTelemetry(options =>
{
options.Endpoint = context.Configuration["OTEL_EXPORTER_OTLP_ENDPOINT"]
?? "http://localhost:4317";
options.Protocol = OtlpProtocol.Grpc;
});
});
// Use Serilog request logging instead of the built-in one
app.UseSerilogRequestLogging(options =>
{
options.EnrichDiagnosticContext = (diagnosticContext, httpContext) =>
{
diagnosticContext.Set("RequestHost", httpContext.Request.Host.Value);
diagnosticContext.Set("UserAgent", httpContext.Request.Headers.UserAgent.ToString());
};
});
Configure via appsettings.json:
{
"Serilog": {
"MinimumLevel": {
"Default": "Information",
"Override": {
"Microsoft.AspNetCore": "Warning",
"Microsoft.EntityFrameworkCore.Database.Command": "Warning",
"System.Net.Http.HttpClient": "Warning"
}
}
}
}
Choosing Between MS.Extensions.Logging and Serilog
| Scenario | Recommendation |
|---|---|
| Console + OTLP export only | Microsoft.Extensions.Logging + OpenTelemetry exporter |
| Need Elasticsearch, Seq, or Datadog sinks | Serilog |
| .NET Aspire application | Use the built-in logging (Aspire configures OTLP automatically) |
| High-throughput, minimal allocation | Source-generated LoggerMessage (works with both) |
Health Checks
Health checks enable orchestrators (Kubernetes, Docker, load balancers) to determine whether your application is ready to serve traffic.
Health Check Packages
The built-in Microsoft.Extensions.Diagnostics.HealthChecks package provides the core framework. Community packages from Xabaril/AspNetCore.Diagnostics.HealthChecks add provider-specific checks:
| Package | Extension Method |
|---|---|
| `AspNetCore.HealthChecks.Npgsql` | `.AddNpgSql()` |
| `AspNetCore.HealthChecks.Redis` | `.AddRedis()` |
| `AspNetCore.HealthChecks.Uris` | `.AddUrlGroup()` |
| `AspNetCore.HealthChecks.UI.Client` | `UIResponseWriter.WriteHealthCheckUIResponse` |
Basic Health Checks
builder.Services.AddHealthChecks()
.AddCheck("self", () => HealthCheckResult.Healthy(), tags: ["live"])
.AddNpgSql(
builder.Configuration.GetConnectionString("DefaultConnection")!,
name: "database",
tags: ["ready"])
.AddRedis(
builder.Configuration.GetConnectionString("Redis")!,
name: "redis",
tags: ["ready"])
.AddUrlGroup(
new Uri("https://api.external.com/health"),
name: "external-api",
tags: ["ready"]);
var app = builder.Build();
// Liveness: is the process running? (don't check dependencies)
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
Predicate = check => check.Tags.Contains("live")
});
// Readiness: can the process serve traffic? (check dependencies)
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
Predicate = check => check.Tags.Contains("ready"),
ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});
Custom Health Checks
public sealed class DiskSpaceHealthCheck(
IOptions<DiskSpaceOptions> options) : IHealthCheck
{
public Task<HealthCheckResult> CheckHealthAsync(
HealthCheckContext context,
CancellationToken ct = default)
{
var drive = new DriveInfo(options.Value.DrivePath);
var freeSpaceMb = drive.AvailableFreeSpace / (1024 * 1024);
var data = new Dictionary<string, object>
{
["FreeSpaceMB"] = freeSpaceMb,
["DrivePath"] = options.Value.DrivePath
};
if (freeSpaceMb < options.Value.MinimumFreeSpaceMb)
{
return Task.FromResult(HealthCheckResult.Unhealthy(
$"Low disk space: {freeSpaceMb}MB remaining", data: data));
}
return Task.FromResult(HealthCheckResult.Healthy(
$"Disk space OK: {freeSpaceMb}MB free", data: data));
}
}
// Registration
builder.Services.AddHealthChecks()
.AddCheck<DiskSpaceHealthCheck>("disk-space", tags: ["ready"]);
Liveness vs Readiness
| Check | Purpose | Failure Action | Example |
|---|---|---|---|
| Liveness (`/health/live`) | Is the process healthy? | Restart container | Self-check, deadlock detection |
| Readiness (`/health/ready`) | Can the process serve traffic? | Remove from load balancer | Database, Redis, external APIs |
Important: Liveness checks should NOT include dependency checks. If a database is down, restarting your app will not fix the database. Liveness checks that fail on dependency issues cause cascading restarts.
Health Check Publishing
HealthCheckPublisherOptions controls the periodic evaluation schedule. To push results to monitoring systems, register an IHealthCheckPublisher implementation:
builder.Services.AddHealthChecks()
.AddCheck("self", () => HealthCheckResult.Healthy());
// Configure periodic evaluation schedule
builder.Services.Configure<HealthCheckPublisherOptions>(options =>
{
options.Delay = TimeSpan.FromSeconds(5); // Initial delay before first run
options.Period = TimeSpan.FromSeconds(30); // Interval between evaluations
});
// Register a publisher to push results (e.g., to logs, metrics, or external systems)
builder.Services.AddSingleton<IHealthCheckPublisher, LoggingHealthCheckPublisher>();
A minimal publisher that logs health status:
public sealed class LoggingHealthCheckPublisher(
ILogger<LoggingHealthCheckPublisher> logger) : IHealthCheckPublisher
{
public Task PublishAsync(
HealthReport report, CancellationToken ct)
{
logger.LogInformation(
"Health check: {Status} ({TotalDuration}ms)",
report.Status,
report.TotalDuration.TotalMilliseconds);
return Task.CompletedTask;
}
}
Putting It Together: Production Configuration
A complete observability setup for a production .NET API:
var builder = WebApplication.CreateBuilder(args);
// 1. OpenTelemetry -- traces and metrics (logs are exported in step 2)
builder.Services.AddOpenTelemetry()
.ConfigureResource(resource => resource
.AddService(builder.Environment.ApplicationName))
.WithTracing(tracing => tracing
.AddAspNetCoreInstrumentation()
.AddHttpClientInstrumentation()
.AddSource("MyApp.*")
.AddOtlpExporter())
.WithMetrics(metrics => metrics
.AddAspNetCoreInstrumentation()
.AddHttpClientInstrumentation()
.AddRuntimeInstrumentation()
.AddMeter("MyApp.*")
.AddOtlpExporter());
// 2. Structured logging with OpenTelemetry export
builder.Logging.AddOpenTelemetry(logging =>
{
logging.IncludeScopes = true;
logging.IncludeFormattedMessage = true;
logging.AddOtlpExporter();
});
// 3. Health checks
builder.Services.AddHealthChecks()
.AddCheck("self", () => HealthCheckResult.Healthy(), tags: ["live"])
.AddNpgSql(
builder.Configuration.GetConnectionString("DefaultConnection")!,
name: "database",
tags: ["ready"]);
// 4. Custom application metrics
builder.Services.AddSingleton<OrderMetrics>();
var app = builder.Build();
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
Predicate = check => check.Tags.Contains("live")
});
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
Predicate = check => check.Tags.Contains("ready")
});
app.Run();
Key Principles
- Use OpenTelemetry as the standard -- it provides vendor-neutral instrumentation that works with any backend (Prometheus, Grafana, Datadog, Azure Monitor, AWS X-Ray)
- Use `IMeterFactory` from DI -- do not create `Meter` instances directly in DI-registered services; the factory ties meter lifetime to the DI container, and the meter name is still registered with `.AddMeter()`
- Use source-generated `LoggerMessage` for hot paths -- zero allocation when the log level is disabled
- Separate liveness from readiness -- liveness checks should not include dependency health; readiness checks should
- Configure via environment variables -- OTLP endpoint, service name, and resource attributes should not be hardcoded
- Enrich logs with trace context -- structured logging with `TraceId` and `SpanId` enables log-to-trace correlation
- Follow OpenTelemetry semantic conventions for metric and span names
Agent Gotchas
- Do not create `Meter` via `new` in DI-registered services -- use `IMeterFactory.Create()` so the meter's lifetime follows the DI container and meters stay isolated between containers (for example in tests). Either way, the meter name must be registered with `.AddMeter()` or its instruments are not exported. `ActivitySource` has no DI factory; keep it in a static field and register its name via `.AddSource()`.
- Do not add dependency checks to liveness endpoints -- a database outage should not restart the app. Only the readiness endpoint should check dependencies.
- Do not use `ILogger.LogInformation("message: " + value)` or string interpolation `$"message: {value}"` -- use structured logging templates: `ILogger.LogInformation("message: {Value}", value)`. String concatenation and interpolation bypass structured logging and prevent log indexing.
- Do not configure OTLP endpoints in code for production -- use environment variables (`OTEL_EXPORTER_OTLP_ENDPOINT`) so the same image works across environments.
- Do not forget to register custom `ActivitySource` names with `.AddSource("MyApp.*")` -- unregistered sources are silently ignored and produce no traces.
References
- OpenTelemetry .NET documentation
- .NET observability with OpenTelemetry
- ASP.NET Core health checks
- System.Diagnostics.Metrics
- High-performance logging in .NET
- Logging in .NET: Log filtering
- Serilog OpenTelemetry sink
Attribution
Adapted from Aaronontheweb/dotnet-skills (MIT license).