loki-config-generator
Loki Configuration Generator
Overview
Generate production-ready Grafana Loki server configurations with best practices. Supports monolithic, simple scalable, and microservices deployment modes with S3, GCS, Azure, or filesystem storage.
Current Stable: Loki 3.6.2 (November 2025) Important: Promtail deprecated in 3.4 - use Grafana Alloy instead. See
examples/grafana-alloy.yamlfor log collection configuration.
When to Use
Invoke when: deploying Loki, creating configs from scratch, migrating to Loki, implementing multi-tenant logging, configuring storage backends, or optimizing existing deployments.
Generation Methods
Method 1: Script Generation (Recommended)
Use scripts/generate_config.py for consistent, validated configurations:
# Simple Scalable with S3 (production)
python3 scripts/generate_config.py \
--mode simple-scalable \
--storage s3 \
--bucket my-loki-bucket \
--region us-east-1 \
--retention-days 30 \
--otlp-enabled \
--output loki-config.yaml
# Monolithic with filesystem (development)
python3 scripts/generate_config.py \
--mode monolithic \
--storage filesystem \
--no-auth-enabled \
--output loki-dev.yaml
# Production with Thanos storage (Loki 3.4+)
python3 scripts/generate_config.py \
--mode simple-scalable \
--storage s3 \
--thanos-storage \
--otlp-enabled \
--time-sharding \
--output loki-thanos.yaml
Script Options:
| Option | Description |
|---|---|
--mode |
monolithic, simple-scalable, microservices |
--storage |
filesystem, s3, gcs, azure |
--auth-enabled / --no-auth-enabled |
Explicitly enable/disable auth |
--otlp-enabled |
Enable OTLP ingestion configuration |
--thanos-storage |
Use Thanos object storage client (3.4+, cloud backends) |
--time-sharding |
Enable out-of-order ingestion (simple-scalable) |
--ruler |
Enable alerting/recording rules (not monolithic) |
--horizontal-compactor |
main/worker mode (simple-scalable, 3.6+) |
--zone-awareness |
Enable multi-AZ placement safeguards |
--limits-dry-run |
Log limit rejections without enforcing |
Method 2: Manual Configuration
Follow the staged workflow below when script generation doesn't meet specific requirements or when learning the configuration structure.
Output Formats
For Kubernetes deployments, generate BOTH formats:
- Native Loki config (
loki-config.yaml) - For ConfigMap or direct use - Helm values (
values.yaml) - For Helm chart deployments
See examples/kubernetes-helm-values.yaml for Helm format.
Documentation Lookup
When to Use Context7/Web Search
REQUIRED - Use Context7 MCP for:
- Configuring features from Loki 3.4+ (Thanos storage, time sharding)
- Configuring features from Loki 3.6+ (horizontal compactor, enforced labels)
- Bloom filter configuration (complex, experimental)
- Custom OTLP attribute mappings beyond standard patterns
- Troubleshooting configuration errors
OPTIONAL - Skip documentation lookup for:
- Standard deployment modes (monolithic, simple-scalable)
- Basic storage configuration (S3, GCS, Azure, filesystem)
- Default limits and component settings
- Configurations covered in
references/directory
Context7 MCP (preferred)
resolve-library-id: "grafana loki"
get-library-docs: /websites/grafana_loki, topic: [component]
Example topics: storage_config, limits_config, otlp, compactor, ruler, bloom
Web Search Fallback
Use when Context7 unavailable: "Grafana Loki 3.6 [component] configuration documentation site:grafana.com"
Configuration Workflow
Stage 1: Gather Requirements
Deployment Mode:
| Mode | Scale | Use Case |
|---|---|---|
| Monolithic | <100GB/day | Testing, development |
| Simple Scalable | 100GB-1TB/day | Production |
| Microservices | >1TB/day | Large-scale, multi-tenant |
Storage Backend: S3, GCS, Azure Blob, Filesystem, MinIO
Key Questions: Expected log volume? Retention period? Multi-tenancy needed? High availability requirements? Kubernetes deployment?
Ask the user directly if required information is missing.
Stage 2: Schema Configuration (CRITICAL)
For all new deployments (Loki 2.9+), use TSDB with v13 schema:
schema_config:
configs:
- from: "2025-01-01" # Use deployment date
store: tsdb
object_store: s3 # s3, gcs, azure, filesystem
schema: v13
index:
prefix: loki_index_
period: 24h
Key: Schema cannot change after deployment without migration.
Stage 3: Storage Configuration
S3:
common:
storage:
s3:
s3: s3://us-east-1/loki-bucket
s3forcepathstyle: false
GCS: gcs: { bucket_name: loki-bucket }
Azure: azure: { container_name: loki-container, account_name: ${AZURE_ACCOUNT_NAME} }
Filesystem: filesystem: { chunks_directory: /loki/chunks, rules_directory: /loki/rules }
Stage 4: Component Configuration
Ingester:
ingester:
chunk_encoding: snappy
chunk_idle_period: 30m
max_chunk_age: 2h
chunk_target_size: 1572864 # 1.5MB
lifecycler:
ring:
replication_factor: 3 # 3 for production
Querier:
querier:
max_concurrent: 4
query_timeout: 1m
Compactor:
compactor:
working_directory: /loki/compactor
compaction_interval: 10m
retention_enabled: true
retention_delete_delay: 2h
Stage 5: Limits Configuration
limits_config:
ingestion_rate_mb: 10
ingestion_burst_size_mb: 20
max_streams_per_user: 10000
max_entries_limit_per_query: 5000
max_query_length: 721h
retention_period: 30d
allow_structured_metadata: true
volume_enabled: true
Stage 6: Server & Auth
server:
http_listen_port: 3100
grpc_listen_port: 9096
log_level: info
auth_enabled: true # false for single-tenant
Stage 7: OTLP Ingestion (Loki 3.0+)
Native OpenTelemetry ingestion - use otlphttp exporter (NOT deprecated lokiexporter):
limits_config:
allow_structured_metadata: true
otlp_config:
resource_attributes:
attributes_config:
- action: index_label # Low-cardinality only!
attributes: [service.name, service.namespace, deployment.environment]
- action: structured_metadata # High-cardinality
attributes: [k8s.pod.name, service.instance.id]
Actions: index_label (searchable, low-cardinality), structured_metadata (queryable), drop
⚠️ NEVER use
k8s.pod.nameas index_label - use structured_metadata instead.
OTel Collector:
exporters:
otlphttp:
endpoint: http://loki:3100/otlp
Stage 8: Caching
chunk_store_config:
chunk_cache_config:
memcached_client:
host: memcached-chunks
timeout: 500ms
query_range:
cache_results: true
results_cache:
cache:
memcached_client:
host: memcached-results
Stage 9: Advanced Features
Pattern Ingester (3.0+):
pattern_ingester:
enabled: true
Bloom Filters (Experimental, 3.3+): Only for >75TB/month deployments. Works on structured metadata only. See examples/ for config.
Time Sharding (3.4+): For out-of-order ingestion:
limits_config:
shard_streams:
time_sharding_enabled: true
Thanos Storage (3.4+): New storage client, opt-in now, default later:
storage_config:
use_thanos_objstore: true
object_store:
s3:
bucket_name: my-bucket
endpoint: s3.us-west-2.amazonaws.com
Stage 10: Ruler (Alerting)
ruler:
storage:
type: s3
s3: { bucket_name: loki-ruler }
alertmanager_url: http://alertmanager:9093
enable_api: true
enable_sharding: true
Stage 11: Loki 3.6 Features
- Horizontally Scalable Compactor:
horizontal_scaling_mode: main|worker - Policy-Based Enforced Labels:
enforced_labels: [service.name] - FluentBit v4:
structured_metadataparameter support
Stage 12: Validate Configuration (REQUIRED)
Always validate before deployment:
# Syntax and parameter validation
loki -config.file=loki-config.yaml -verify-config
# Print resolved configuration (shows defaults)
loki -config.file=loki-config.yaml -print-config-stderr 2>&1 | head -100
# Dry-run with Docker (if Loki not installed locally)
docker run --rm -v $(pwd)/loki-config.yaml:/etc/loki/config.yaml \
grafana/loki:3.6.2 -config.file=/etc/loki/config.yaml -verify-config
Validation Checklist:
- No syntax errors from
-verify-config - Schema uses
tsdbandv13 -
replication_factor: 3for production -
auth_enabled: trueif multi-tenant - Storage credentials/IAM configured
- Retention period matches requirements
Production Checklist
High Availability Requirements
Zone-Aware Replication (CRITICAL for production multi-AZ deployments):
When using replication_factor: 3, ALWAYS enable zone-awareness for multi-AZ deployments:
ingester:
lifecycler:
ring:
replication_factor: 3
zone_awareness_enabled: true # CRITICAL for multi-AZ
# Set zone via environment variable or config
# Each pod should set its zone based on node topology
common:
instance_availability_zone: ${AVAILABILITY_ZONE}
Why: Without zone-awareness, all 3 replicas may land in the same AZ. If that AZ fails, you lose data.
Kubernetes Implementation:
# In Helm values or pod spec
env:
- name: AVAILABILITY_ZONE
valueFrom:
fieldRef:
fieldPath: metadata.labels['topology.kubernetes.io/zone']
TLS Configuration (Production Required)
Enable TLS for all inter-component and client communication:
server:
http_tls_config:
cert_file: /etc/loki/tls/tls.crt
key_file: /etc/loki/tls/tls.key
client_ca_file: /etc/loki/tls/ca.crt # For mTLS
grpc_tls_config:
cert_file: /etc/loki/tls/tls.crt
key_file: /etc/loki/tls/tls.key
client_ca_file: /etc/loki/tls/ca.crt
See examples/production-tls.yaml for complete TLS configuration.
Production Checklist Summary
| Requirement | Setting | Required For |
|---|---|---|
replication_factor: 3 |
common block | All production |
zone_awareness_enabled: true |
ingester.lifecycler.ring | Multi-AZ |
auth_enabled: true |
root level | Multi-tenant |
| TLS enabled | server block | All production |
| IAM roles (not keys) | storage config | Cloud storage |
| Caching enabled | chunk_store_config, query_range | Performance |
| Pattern ingester | pattern_ingester.enabled | Observability |
| Retention configured | compactor + limits_config | Cost control |
Monitoring Recommendations
Key Metrics to Monitor
Configure Prometheus to scrape Loki metrics and alert on these critical indicators:
# Prometheus scrape config
- job_name: 'loki'
static_configs:
- targets: ['loki:3100']
Critical Alerts
groups:
- name: loki-critical
rules:
# Ingestion failures
- alert: LokiIngestionFailures
expr: sum(rate(loki_distributor_ingester_append_failures_total[5m])) > 0
for: 5m
labels:
severity: critical
annotations:
summary: "Loki ingestion failures detected"
# High stream cardinality (performance killer)
- alert: LokiHighStreamCardinality
expr: loki_ingester_memory_streams > 100000
for: 10m
labels:
severity: warning
annotations:
summary: "High stream cardinality - review labels"
# Compaction not running (retention broken)
- alert: LokiCompactionStalled
expr: time() - loki_compactor_last_successful_run_timestamp_seconds > 7200
for: 5m
labels:
severity: critical
annotations:
summary: "Loki compaction stalled - retention not enforced"
# Query latency
- alert: LokiSlowQueries
expr: histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket{route=~"loki_api_v1_query.*"}[5m])) by (le)) > 30
for: 10m
labels:
severity: warning
annotations:
summary: "Loki query P99 latency > 30s"
# Ingester memory pressure
- alert: LokiIngesterMemoryHigh
expr: container_memory_usage_bytes{container="ingester"} / container_spec_memory_limit_bytes{container="ingester"} > 0.8
for: 10m
labels:
severity: warning
annotations:
summary: "Loki ingester memory usage > 80%"
Key Metrics Reference
| Metric | Description | Action Threshold |
|---|---|---|
loki_ingester_memory_streams |
Active streams in memory | >100k: review cardinality |
loki_distributor_ingester_append_failures_total |
Ingestion failures | >0: investigate immediately |
loki_request_duration_seconds |
Query latency | P99 >30s: add caching/queriers |
loki_ingester_chunks_flushed_total |
Chunk flush rate | Low rate: check ingester health |
loki_compactor_last_successful_run_timestamp_seconds |
Last compaction | >2h ago: compaction broken |
Grafana Dashboard
Import official Loki dashboards:
- Dashboard ID:
13407- Loki Logs - Dashboard ID:
14055- Loki Operational
Log Collection with Grafana Alloy
Promtail is deprecated (support ends Feb 2026). Use Grafana Alloy for new deployments.
Basic Alloy Configuration
See examples/grafana-alloy.yaml for complete configuration.
// Kubernetes log discovery
discovery.kubernetes "pods" {
role = "pod"
}
// Relabeling for Kubernetes metadata
discovery.relabel "pods" {
targets = discovery.kubernetes.pods.targets
rule {
source_labels = ["__meta_kubernetes_namespace"]
target_label = "namespace"
}
rule {
source_labels = ["__meta_kubernetes_pod_name"]
target_label = "pod"
}
rule {
source_labels = ["__meta_kubernetes_pod_container_name"]
target_label = "container"
}
}
// Log collection
loki.source.kubernetes "pods" {
targets = discovery.relabel.pods.output
forward_to = [loki.write.default.receiver]
}
// Send to Loki
loki.write "default" {
endpoint {
url = "http://loki-gateway.loki.svc.cluster.local/loki/api/v1/push"
// For multi-tenant
tenant_id = "default"
}
}
Migration from Promtail
# Convert Promtail config to Alloy
alloy convert --source-format=promtail --output=alloy-config.alloy promtail.yaml
Complete Examples
See examples/ directory for full configurations:
monolithic-filesystem.yaml- Development/testingsimple-scalable-s3.yaml- Production with S3microservices-s3.yaml- Large-scale distributedmulti-tenant.yaml- Multi-tenant with per-tenant limitsproduction-tls.yaml- TLS-enabled production configgrafana-alloy.yaml- Log collection with Alloykubernetes-helm-values.yaml- Helm chart values
Minimal Monolithic:
auth_enabled: false
server:
http_listen_port: 3100
common:
path_prefix: /loki
storage:
filesystem:
chunks_directory: /loki/chunks
rules_directory: /loki/rules
replication_factor: 1
ring:
kvstore:
store: inmemory
schema_config:
configs:
- from: 2025-01-01
store: tsdb
object_store: filesystem
schema: v13
index:
prefix: loki_index_
period: 24h
limits_config:
retention_period: 30d
allow_structured_metadata: true
compactor:
working_directory: /loki/compactor
retention_enabled: true
Helm Deployment
helm repo add grafana https://grafana.github.io/helm-charts
helm install loki grafana/loki -f values.yaml
Generate both native config and Helm values for Kubernetes deployments.
# values.yaml
deploymentMode: SimpleScalable
loki:
schemaConfig:
configs:
- from: "2025-01-01"
store: tsdb
object_store: s3
schema: v13
index:
prefix: loki_index_
period: 24h
limits_config:
retention_period: 30d
allow_structured_metadata: true
# Zone awareness for HA
ingester:
lifecycler:
ring:
zone_awareness_enabled: true
backend:
replicas: 3
# Spread across zones
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: DoNotSchedule
read:
replicas: 3
write:
replicas: 3
Best Practices
Performance:
chunk_encoding: snappy,chunk_target_size: 1572864- Enable caching (chunks, results)
parallelise_shardable_queries: true
Security:
auth_enabled: truewith reverse proxy auth- IAM roles for cloud storage (never hardcode keys)
- TLS for all communications (see Production Checklist)
Reliability:
replication_factor: 3for productionzone_awareness_enabled: truefor multi-AZ (see Production Checklist)- Persistent volumes for ingesters
- Monitor ingestion rate and query latency (see Monitoring section)
Limits: Set ingestion_rate_mb, max_streams_per_user to prevent overload
Common Issues
| Issue | Solution |
|---|---|
| High ingester memory | Reduce max_streams_per_user, lower chunk_idle_period |
| Slow queries | Increase max_concurrent, enable parallelization, add caching |
| Ingestion failures | Check ingestion_rate_mb, verify storage connectivity |
| Storage growing fast | Enable retention, check compression, review cardinality |
| Data loss in AZ failure | Enable zone_awareness_enabled: true |
| Config validation fails | Run loki -verify-config, check YAML syntax |
Deprecated (Migrate Away)
boltdb-shipper→tsdblokiexporter→otlphttp- Promtail → Grafana Alloy (support ends Feb 2026)
Resources
scripts/generate_config.py - Generate configs programmatically (RECOMMENDED) examples/ - Complete configuration examples for all modes references/ - Full parameter reference and best practices
Related Skills
- logql-generator - LogQL query generation
- fluentbit-generator - Log collection to Loki