# Analyzing Schema Change Storage Risk
Estimates the storage headroom needed to safely run online schema changes. Mirrors the official guidance: some operations may temporarily require up to 3× the size of the affected table or index while the schema change is in flight.
For ongoing range-distribution monitoring, see the `analyzing-range-distribution` skill.
## When to Use This Skill
Run a quick estimate before issuing any of these operations on a table whose indexes are large (multiple GB per index, or many ranges per index):
- `CREATE INDEX`
- `ADD COLUMN` with `INDEX` or `UNIQUE`
- `ALTER PRIMARY KEY`
- `CREATE MATERIALIZED VIEW`
- `CREATE TABLE AS`
- `REFRESH MATERIALIZED VIEW`
- `ALTER TABLE ... SET LOCALITY` (when the locality change rewrites data)
Tables whose indexes are small (kilobytes to a few megabytes) carry trivial storage risk; estimation is unnecessary.
## Background
### How much temporary space does a backfill actually need?
The honest answer depends on the operation:
- `CREATE INDEX` / `ADD COLUMN ... UNIQUE`: needs roughly 1× the size of the new index — the indexed columns plus the primary key columns, written into a fresh index span. This is typically a small fraction of the table; worst-case headroom is bounded by the size of that one index.
- `ALTER PRIMARY KEY`: rewrites the primary index and any secondary indexes whose definitions depend on the old PK. Old data sticks around until GC, so peak on-disk usage during the change can approach the size of the table again.
- All bulk-ingest backfills: extra MVCC versions and pre-compaction SSTs add overhead until Pebble compacts and GC runs.
The official docs round these up into a single conservative recommendation:

> Plan for up to 3× the size of the affected table or index as free space.

That figure is a safety bound, not a precise prediction. For most `CREATE INDEX` operations the real cost is much smaller; for `ALTER PRIMARY KEY` on a large table it is the right ballpark.
### What happens if the cluster runs out of disk mid-backfill?
Backfills bulk-ingest data via `AddSSTable`, which checks each store's remaining capacity before every ingestion. If the remaining fraction falls below `kv.bulk_io_write.min_capacity_remaining_fraction` (default 0.05, i.e. 5%), the ingest is rejected with `InsufficientSpaceError`. Both the legacy and declarative schema changers translate that error into a job pause request, so the schema change halts rather than wedging the cluster. To resume, free space (e.g. drop unused indexes, expand storage) and resume the paused job.
This is a reactive safety net, not a planning tool — by the time it fires, foreground writes on the affected store may already be unhealthy.
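The capacity floor above is simple to reason about numerically. The following is an illustrative sketch of the check (the function name and structure are this document's invention, not CockroachDB source code): an ingest is admitted only while the store's remaining-capacity fraction stays at or above the setting's value.

```python
# Illustrative model of the per-store capacity check applied before each
# AddSSTable ingest. The default of the cluster setting
# kv.bulk_io_write.min_capacity_remaining_fraction is 0.05 (5%).
MIN_CAPACITY_REMAINING_FRACTION = 0.05


def ingest_allowed(capacity_bytes: int, available_bytes: int) -> bool:
    """Return True if the store still has enough headroom to accept an ingest."""
    return available_bytes / capacity_bytes >= MIN_CAPACITY_REMAINING_FRACTION


tib = 1 << 40  # 1 TiB store
# 40 GiB free on a 1 TiB store is ~3.9% remaining: below the 5% floor,
# so the ingest would be rejected with InsufficientSpaceError.
print(ingest_allowed(tib, 40 * (1 << 30)))  # False
# 80 GiB free is ~7.8% remaining: above the floor, so the ingest proceeds.
print(ingest_allowed(tib, 80 * (1 << 30)))  # True
```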
## Estimating Capacity
### Step 1 — Check free space per store
The minimum free space across stores is what bounds the schema change, not the total cluster free space (replicas are spread across nodes).
```sql
SELECT
  node_id,
  store_id,
  ROUND((capacity - used) / 1073741824.0, 2) AS free_gb,
  ROUND((used::FLOAT / capacity) * 100, 2) AS used_pct
FROM crdb_internal.kv_store_status
ORDER BY free_gb ASC;
```
### Step 2 — Estimate the affected table/index size
Use the docs-recommended form of SHOW RANGES:
```sql
SHOW RANGES FROM TABLE <table> WITH DETAILS, KEYS, INDEXES;
```
The output includes one row per range, with `range_size_mb` and `index_name` columns.
Aggregate by index for the per-index totals that matter for capacity planning:
```sql
WITH r AS (SHOW RANGES FROM TABLE <table> WITH DETAILS, KEYS, INDEXES)
SELECT
  index_name,
  COUNT(*) AS range_count,
  ROUND(SUM(range_size_mb), 2) AS index_size_mb,
  ROUND(SUM(range_size_mb) / 1024, 2) AS index_size_gb
FROM r
GROUP BY index_name
ORDER BY index_size_mb DESC;
```
### Step 3 — Compare against the operation
| Operation | Conservative free-space target (per store) |
|---|---|
| `CREATE INDEX` / `ADD COLUMN ... UNIQUE` | Up to 3× the size of the new index (its indexed + PK columns). |
| `ALTER PRIMARY KEY` | Up to 3× the size of the table (sum of the relevant indexes from step 2). |
| `CREATE MATERIALIZED VIEW` / `CREATE TABLE AS` | Up to 3× the expected size of the materialized result. |
The new index does not exist yet, so estimate it from a comparable existing index (e.g. one on similarly typed columns) or from the source columns' contribution to the primary index.
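One hedged way to get that estimate is to size a comparable existing index and use its total as the stand-in. Here `<table>@<comparable_index>` is a placeholder for an index over similarly typed columns, not a name from this document:

```sql
-- Ballpark the planned index by measuring a structurally similar one.
WITH r AS (SHOW RANGES FROM INDEX <table>@<comparable_index> WITH DETAILS)
SELECT ROUND(SUM(range_size_mb) / 1024, 2) AS estimated_new_index_gb
FROM r;
```

If the planned index covers wider columns than the comparable one, scale the result up accordingly.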
If the smallest free-space figure from step 1 is well above the target, the operation is safe to run. If it is close, free space first (drop unused indexes, expand storage) before issuing the DDL.
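The comparison in steps 1–3 reduces to one inequality. A minimal sketch (the helper name and inputs are illustrative; the numbers would come from the queries above):

```python
def headroom_ok(min_store_free_gb: float, affected_size_gb: float,
                multiplier: float = 3.0) -> bool:
    """True if the smallest per-store free space meets the docs'
    conservative target of multiplier x the affected object's size."""
    return min_store_free_gb >= affected_size_gb * multiplier


# Example: ALTER PRIMARY KEY on a 50 GB table targets ~150 GB free per store.
print(headroom_ok(min_store_free_gb=200.0, affected_size_gb=50.0))  # True
print(headroom_ok(min_store_free_gb=120.0, affected_size_gb=50.0))  # False
```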
## Operational Notes
- **`SHOW RANGES ... WITH DETAILS` is expensive.** It computes span statistics on demand. Always target a specific table, never run it cluster-wide, and prefer maintenance windows on tables with thousands of ranges.
- **Watch the job, not just disk.** If a backfill pauses with `InsufficientSpaceError`, free disk on the affected store and resume the paused schema change job. Check with:

  ```sql
  SELECT job_id, status, error
  FROM crdb_internal.jobs
  WHERE job_type = 'SCHEMA CHANGE' AND status = 'paused';
  ```

- **Drop unused indexes first.** Often the cheapest way to free headroom before a large backfill is to drop indexes that `crdb_internal.index_usage_statistics` shows are unused.
- **Statistics lag.** `range_size_mb` is approximate and can lag actual disk usage; treat estimates as conservative ballparks, not exact figures.
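Once space has been freed, the paused job can be resumed by the ID surfaced in the jobs query above (`<job_id>` is a placeholder):

```sql
-- Resume a schema change that paused with InsufficientSpaceError.
RESUME JOB <job_id>;
```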
## Related Skills
- `analyzing-range-distribution` — range count, leaseholder placement, fragmentation