molt-replicator
Continuous change-data-capture (CDC) replication from source databases to CockroachDB. Run after molt fetch completes the initial bulk load.
Important: replicator is a separate binary from molt. It is not invoked by molt fetch. The data-load-and-replication mode in molt fetch is deprecated; use replicator directly instead.
Architecture
Source DB ──► [replicator] ──► Staging DB (_replicator schema) ──► Target CockroachDB
    ▲
    │ Publication / Slot / BinLog / LogMiner
Replicator reads changes from the source, buffers them in a staging schema on the target CRDB cluster, and applies them to the target tables.
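Once replicator is running, the staging side can be inspected directly on the target cluster. A quick peek, assuming the default _replicator database and public staging schema:

```sql
-- Run on the target CockroachDB cluster. The table names under the
-- staging schema are replicator internals and vary by version; this
-- just confirms mutations are being staged.
SHOW TABLES FROM _replicator.public;
```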
Subcommands by Source
| Source | Command |
|---|---|
| PostgreSQL | replicator pglogical |
| MySQL | replicator mylogical |
| Oracle | replicator oraclelogminer |
| Kafka | replicator kafka |
| Cloud storage | replicator objstore |
| CockroachDB CDC | replicator start |
Full Fetch → Replicator Workflow
Step 1: Initial bulk load with molt fetch
molt fetch \
--source "postgresql://user:pass@source:5432/db" \
--target "postgresql://root@crdb:26257/db" \
--bucket-path "s3://mybucket/migration" \
--table-handling drop-on-target-and-recreate
Step 2: Create publication on source (PostgreSQL)
-- Run on source PostgreSQL:
CREATE PUBLICATION molt_fetch FOR ALL TABLES;
-- (molt fetch may have already created this; check first)
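To check, confirm the publication exists and covers the expected tables (pg_publication and pg_publication_tables are standard PostgreSQL catalogs):

```sql
-- Run on the source PostgreSQL:
SELECT pubname FROM pg_publication;
SELECT schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'molt_fetch';
```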
Step 3: Create staging database on target
-- Run on target CockroachDB:
CREATE DATABASE _replicator;
Step 4: Test connectivity
replicator preflight \
--sourceConn "postgresql://user:pass@source:5432/db" \
--targetConn "postgresql://root@crdb:26257/db"
Step 5: Start replicator
replicator pglogical \
--publicationName "molt_fetch" \
--sourceConn "postgresql://user:pass@source:5432/db" \
--stagingConn "postgresql://root@crdb:26257/_replicator" \
--stagingSchema "_replicator.public" \
--targetConn "postgresql://root@crdb:26257/db" \
--targetSchema "public" \
--metricsAddr "0.0.0.0:8080"
Step 6: Monitor lag
curl http://localhost:8080/metrics | grep replicator_
# Watch for: mutations applied, unapplied mutations, lag
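The scrape-and-grep step can be wrapped in a small shell helper (the replicator_ prefix matches the grep above; exact series names vary by replicator version, so eyeball the lag and unapplied-mutation gauges rather than hard-coding them):

```shell
# filter_replicator_metrics: keep only replicator_* series from a
# Prometheus text-format scrape read on stdin, sorted by name.
filter_replicator_metrics() {
  grep '^replicator_' | sort
}

# Typical use against the endpoint set by --metricsAddr:
# curl -s http://localhost:8080/metrics | filter_replicator_metrics
```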
Step 7: Cutover
- When lag reaches ~0, redirect app writes to CockroachDB
- Let replicator drain remaining changes
- Confirm no new writes on source
- Stop replicator
- Decommission source
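One way to confirm the source has truly quiesced before stopping replicator is to sample the source's WAL position twice (pg_current_wal_lsn() is standard PostgreSQL; other sources have analogous checks):

```sql
-- Run on the source PostgreSQL. If the value is identical across two
-- samples taken a few seconds apart, no new writes are arriving.
SELECT pg_current_wal_lsn();
```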
Source-Specific Setup
PostgreSQL (pglogical)
Source prerequisites:
- User with REPLICATION privilege
- Logical replication enabled (wal_level = logical)
- Publication exists (created by molt fetch or manually)
replicator pglogical \
--publicationName "molt_fetch" \
--slotName "replicator" \
--sourceConn "postgresql://..." \
--stagingConn "postgresql://root@crdb:26257/_replicator" \
--stagingSchema "_replicator.public" \
--targetConn "postgresql://root@crdb:26257/db" \
--targetSchema "public"
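After startup, the slot can be verified on the source (pg_replication_slots is a standard PostgreSQL view; the slot name matches --slotName above):

```sql
-- Run on the source PostgreSQL:
SELECT slot_name, active, restart_lsn
FROM pg_replication_slots
WHERE slot_name = 'replicator';
```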
MySQL (mylogical)
Source prerequisites:
- Binary logging enabled (binlog_format = ROW)
- GTID mode on (gtid_mode = ON, enforce_gtid_consistency = ON)
- User with REPLICATION CLIENT privilege
replicator mylogical \
--sourceConn "mysql://root:pass@source:3306/db" \
--stagingConn "postgresql://root@crdb:26257/_replicator" \
--stagingSchema "_replicator.public" \
--targetConn "postgresql://root@crdb:26257/db" \
--targetSchema "public"
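The binlog/GTID prerequisites can be verified on the source before starting (these are standard MySQL server variables):

```sql
-- Run on the source MySQL; all three should report the values shown:
SHOW VARIABLES LIKE 'binlog_format';            -- ROW
SHOW VARIABLES LIKE 'gtid_mode';                -- ON
SHOW VARIABLES LIKE 'enforce_gtid_consistency'; -- ON
```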
Oracle (oraclelogminer)
Source prerequisites:
- Archive log mode enabled
- Supplemental logging enabled
- LogMiner permissions granted
replicator oraclelogminer \
--sourceConn "oracle://app_user:pass@oracle:1521/db" \
--stagingConn "postgresql://root@crdb:26257/_replicator" \
--stagingSchema "_replicator.public" \
--targetConn "postgresql://root@crdb:26257/db" \
--targetSchema "public"
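The archive-log and supplemental-logging prerequisites can be checked on the source (v$database is a standard Oracle view):

```sql
-- Run on the source Oracle instance as a privileged user:
SELECT log_mode, supplemental_log_data_min FROM v$database;
-- Expect LOG_MODE = ARCHIVELOG and SUPPLEMENTAL_LOG_DATA_MIN = YES.
```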
Key Flags
# Performance
--parallelism 16 # concurrent DB transactions (default: 16)
--flushSize 1000 # rows per batch (default: 1000)
--flushPeriod 1s # flush interval (default: 1s)
# Staging connection pool
--stagingMaxPoolSize 128
--stagingIdleTime 1m
--stagingMaxLifetime 5m
# Target connection pool
--targetMaxPoolSize 128
--targetStatementCacheSize 128
# Retry
--maxRetries 10
--retryInitialBackoff 25ms
--retryMaxBackoff 2s
# Monitoring
--metricsAddr "0.0.0.0:8080" # Prometheus metrics endpoint
--schemaRefresh 1m # refresh schema cache (0 = disabled)
# Dead letter queue (failed rows instead of stopping)
--dlqTableName "replicator_dlq"
# Logging
-v # debug
-vv # trace
--logFormat fluent # for log aggregators
--logDestination "/var/log/replicator.log"
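For long-running replication, replicator is typically run under a process supervisor. A systemd unit sketch, with illustrative paths and credentials; TimeoutStopSec is set above the default 30s --gracePeriod so shutdown can drain:

```ini
# /etc/systemd/system/replicator.service (sketch; adjust for your environment)
[Unit]
Description=MOLT Replicator (pglogical)
After=network-online.target

[Service]
ExecStart=/usr/local/bin/replicator pglogical \
  --publicationName molt_fetch \
  --sourceConn "postgresql://user:pass@source:5432/db" \
  --stagingConn "postgresql://root@crdb:26257/_replicator" \
  --stagingSchema _replicator.public \
  --targetConn "postgresql://root@crdb:26257/db" \
  --targetSchema public \
  --metricsAddr "0.0.0.0:8080"
TimeoutStopSec=60
Restart=on-failure

[Install]
WantedBy=multi-user.target
```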
Gotchas
- Staging schema (_replicator.public) is auto-created by replicator, but the database (_replicator) must exist first
- --publicationName and --slotName must match what molt fetch created (default: molt_fetch / molt_slot)
- DLQ table grows over time; monitor and purge failed rows periodically
- Replicator holds an open replication slot on the source, which blocks WAL cleanup; monitor source disk usage
- Graceful shutdown respects --gracePeriod (default: 30s); don't SIGKILL before it elapses
- No built-in alerting; set up external alerts on the Prometheus metrics endpoint
- Long cutover windows increase replication lag; plan for a maintenance window if needed
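The amount of WAL pinned by the open replication slot can be watched with a standard PostgreSQL query (pg_wal_lsn_diff measures the gap between the current WAL position and the slot's restart point):

```sql
-- Run on the source PostgreSQL:
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
         AS retained_wal
FROM pg_replication_slots;
```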
See flags reference for the full flag list.