altinity-expert-clickhouse-schema
SKILL.md
Table Schema and Design Analysis
Analyze table structure, partitioning, ORDER BY, materialized views, and identify design anti-patterns.
Diagnostics
Run all queries from the file checks.sql and analyze the results.
Deep Dive Queries (Placeholder-Based)
Partition Distribution for Specific Table
select
database,
table,
count() as partitions,
sum(rows) as total_rows,
formatReadableSize(sum(bytes_on_disk)) as total_size,
formatReadableSize(median(bytes_on_disk)) as median_partition_size,
min(partition) as oldest_partition,
max(partition) as newest_partition
from system.parts
where active and database = '{database}' and table = '{table}'
group by database, table, partition
order by partition desc
limit 100
Column Compression Analysis for Specific Table
select
name,
type,
formatReadableSize(data_compressed_bytes) as compressed,
formatReadableSize(data_uncompressed_bytes) as uncompressed,
round(data_uncompressed_bytes / nullIf(data_compressed_bytes, 0), 2) as ratio,
compression_codec
from system.columns
where database = '{database}' and table = '{table}'
order by data_compressed_bytes desc
limit 50
Look for:
- Columns with ratio < 2 → consider better codec or data transformation
- Large columns without codec → add CODEC(ZSTD) or LZ4HC
- String columns with low cardinality → consider LowCardinality(String)
Index Usage Analysis for Specific Database
select
database,
table,
name as index_name,
type,
expr,
granularity
from system.data_skipping_indices
where database = '{database}'
order by database, table
Schema Design Recommendations
Partition Key Guidelines
| Data Volume | Recommended Granularity | Example |
|---|---|---|
| < 10GB/month | No partitioning or yearly | toYear(ts) |
| 10-100GB/month | Monthly | toYYYYMM(ts) |
| 100GB-1TB/month | Weekly or daily | toMonday(ts) |
| > 1TB/month | Daily | toDate(ts) |
ORDER BY Guidelines
- First column: Low cardinality, frequently filtered (e.g.,
tenant_id,region) - Second column: Time-based if range queries common
- Subsequent: Other filter columns by selectivity (most selective last)
Anti-patterns:
- UUID/hash as first column
- High-cardinality ID without tenant prefix
- DateTime64 with microseconds as first column
Compression Codec Recommendations
| Data Type | Recommended Codec |
|---|---|
| Integers (sequential) | Delta, ZSTD |
| Integers (random) | ZSTD or LZ4HC |
| Floats | Gorilla, ZSTD |
| Timestamps | DoubleDelta, ZSTD |
| Strings (long) | ZSTD(3) |
| Strings (repetitive) | LowCardinality + ZSTD |
Cross-Module Triggers
| Finding | Load Module | Reason |
|---|---|---|
| Many small partitions | altinity-expert-clickhouse-ingestion |
Check batch sizing |
| Oversized partitions | altinity-expert-clickhouse-merges |
Merge can't complete |
| High PK memory | altinity-expert-clickhouse-memory |
Memory pressure |
| MV performance issues | altinity-expert-clickhouse-reporting |
Query analysis |
| Too many parts per partition | altinity-expert-clickhouse-merges |
Merge backlog |
Settings Reference
| Setting | Default | Recommendation |
|---|---|---|
index_granularity |
8192 | Lower for point lookups, higher for scans |
ttl_only_drop_parts |
0 | Set to 1 if TTL deletes entire partitions |
min_bytes_for_wide_part |
10MB | Increase if many small parts |
Weekly Installs
52
Repository
altinity/skillsGitHub Stars
5
First Seen
Jan 24, 2026
Security Audits
Installed on
codex48
claude-code43
gemini-cli29
opencode28
github-copilot25
cursor23