altinity-expert-clickhouse-metrics
Real-Time Metrics Monitoring
Real-time monitoring of ClickHouse metrics, events, and asynchronous metrics.
Diagnostics
Run all queries from the file checks.sql and analyze the results.
Ad-Hoc Query Guidelines
Key Tables
- system.metrics - Current gauge values
- system.events - Cumulative counters since restart
- system.asynchronous_metrics - System-level metrics
- system.metric_log - Historical metrics
- system.asynchronous_metric_log - Historical async metrics
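Because system.events holds cumulative counters since the last restart, a raw value says little without the server's uptime. A minimal sketch that turns the largest counters into average per-second rates; the column alias avg_per_second is illustrative, and the result is an average over the whole uptime, not a current rate:

```sql
-- Top 10 counters with their average rate since server start.
-- uptime() returns the server uptime in seconds.
select
    event,
    value,
    round(value / uptime(), 2) as avg_per_second
from system.events
order by value desc
limit 10;
```

For rates over a recent window, system.metric_log is usually the better source, since it stores periodic snapshots instead of a single running total.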
Useful Patterns
-- Find metrics by pattern
select * from system.metrics where metric like '%pattern%';
select * from system.asynchronous_metrics where metric like '%pattern%';
select * from system.events where event like '%pattern%';
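The historical tables can be queried in a similar way. The sketch below assumes that metric_log and asynchronous_metric_log are enabled on the server and retain at least the last hour; the one-hour window and the choice of the Query and LoadAverage1 metrics are only examples:

```sql
-- Per-minute peak of concurrently running queries over the last hour.
-- system.metric_log stores each gauge as a CurrentMetric_<name> column.
select
    toStartOfMinute(event_time) as minute,
    max(CurrentMetric_Query) as max_running_queries
from system.metric_log
where event_time > now() - interval 1 hour
group by minute
order by minute;

-- Per-minute peak of the 1-minute load average over the same window.
-- system.asynchronous_metric_log stores one row per metric and timestamp.
select
    toStartOfMinute(event_time) as minute,
    max(value) as max_load_average
from system.asynchronous_metric_log
where metric = 'LoadAverage1'
  and event_time > now() - interval 1 hour
group by minute
order by minute;
```

Note the different layouts: metric_log is wide (one column per metric), while asynchronous_metric_log is narrow (metric name in a column), so the filter moves from the select list to the where clause.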
Cross-Module Triggers
| Finding | Load Module | Reason |
|---|---|---|
| High memory metrics | altinity-expert-clickhouse-memory | Memory analysis |
| High replica delay | altinity-expert-clickhouse-replication | Replication issues |
| High parts count | altinity-expert-clickhouse-merges | Merge backlog |
| High load average | altinity-expert-clickhouse-reporting | Query analysis |
| High connections | altinity-expert-clickhouse-reporting | Connection analysis |
Monitoring Recommendations
Key Metrics to Alert On
| Metric | Warning | Critical |
|---|---|---|
| ReadonlyReplica | - | > 0 |
| Query | > 75% max | > 90% max |
| MemoryResident | > 80% RAM | > 90% RAM |
| MaxPartCountForPartition | > parts_to_delay | > parts_to_throw |
| ReplicasMaxAbsoluteDelay | > 5 min | > 1 hour |
| LoadAverage1 | > CPU count | > 2x CPU count |
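As a rough starting point, the current values of the metrics listed above can be collected in one result set. This sketch assumes that ReadonlyReplica and Query are read from system.metrics and the remaining ones from system.asynchronous_metrics, and that the parts thresholds correspond to the MergeTree settings parts_to_delay_insert and parts_to_throw_insert; comparing against RAM size, CPU count, and the concurrent query limit is left to the alerting layer:

```sql
-- Snapshot of the alert metrics; the cast aligns the integer gauges
-- with the Float64 values of the asynchronous metrics.
select 'metrics' as source, metric, toFloat64(value) as value
from system.metrics
where metric in ('ReadonlyReplica', 'Query')
union all
select 'asynchronous_metrics' as source, metric, value
from system.asynchronous_metrics
where metric in (
    'MemoryResident',
    'MaxPartCountForPartition',
    'ReplicasMaxAbsoluteDelay',
    'LoadAverage1'
);

-- Server-wide defaults for the parts thresholds; individual tables
-- may override them in their own SETTINGS clause.
select name, value
from system.merge_tree_settings
where name in ('parts_to_delay_insert', 'parts_to_throw_insert');
```

ReplicasMaxAbsoluteDelay is reported in seconds, so the 5 min and 1 hour thresholds translate to 300 and 3600.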
Prometheus/Grafana Export
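ClickHouse exposes metrics in Prometheus format at :9363/metrics when the prometheus section is enabled in the server configuration. The port and endpoint shown here are the values from the stock example configuration; both can be changed.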