Kafka Development Practices Skill

  • All topic names must come from configuration values (Typesafe Config or PureConfig), never hard-coded strings.
  • Use the Format or Codec types from the JSON, Avro, or other serialization library already used in the project.
  • Streams logic must be tested with TopologyTestDriver (unit test) plus an integration test against a local Kafka broker.
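The first rule above can be sketched as follows. This is an illustrative Python sketch (the config layout and the `orders`/`payments` topic names are hypothetical); in a JVM project the same idea would be expressed with Typesafe Config or PureConfig and HOCON keys such as `kafka.topics.orders`.

```python
# Hypothetical sketch: topic names come from configuration, never hard-coded.
CONFIG = {
    "kafka": {
        "topics": {
            "orders": "orders.v1",
            "payments": "payments.v2",
        }
    }
}

def topic(name: str) -> str:
    """Resolve a logical topic name from config; fail fast if it is missing."""
    try:
        return CONFIG["kafka"]["topics"][name]
    except KeyError:
        raise KeyError(f"topic '{name}' not declared in config") from None

orders_topic = topic("orders")  # resolves to "orders.v1"
```

Failing fast on an undeclared topic name turns a typo into an immediate startup error instead of a producer silently auto-creating a wrong topic.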

Iron Laws

  1. ALWAYS set explicit acks=all and min.insync.replicas=2 for production producers — acks=1 (the default in older client versions) loses messages on leader failure before replication; acks=0 provides no delivery guarantee at all.
  2. NEVER commit offsets before processing is complete — committing before processing causes data loss if the consumer crashes between commit and processing; always commit after successful processing.
  3. ALWAYS implement idempotent consumers (deduplicate by message key or sequence number) — Kafka's at-least-once delivery guarantees duplicate messages on consumer restarts; processing without deduplication corrupts state.
  4. NEVER use auto.offset.reset=earliest in production consumers for existing topics — earliest replays the entire topic history from the beginning on first start; use latest for new consumer groups on existing topics.
  5. ALWAYS set max.poll.interval.ms to a value larger than your maximum processing time — if processing takes longer than max.poll.interval.ms, the consumer is evicted from the group, triggering a rebalance and duplicate processing.
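The five laws above can be sketched together. This is a minimal Python sketch with plain dicts standing in for client configuration (real clients such as confluent-kafka or the Java client accept the same config keys) and a simulated consume loop, so no broker is needed; the record shapes and helper names are illustrative, not a real client API.

```python
# Iron Law 1: durable producer settings. Note that min.insync.replicas=2
# is a broker/topic-level setting, not a producer config key.
PRODUCER_CONFIG = {
    "acks": "all",               # wait for all in-sync replicas to ack
    "enable.idempotence": True,  # safe retries without producer-side duplicates
}

# Iron Laws 2, 4 and 5: manual commits, no history replay, generous poll interval.
CONSUMER_CONFIG = {
    "enable.auto.commit": False,      # commit manually, only after processing
    "auto.offset.reset": "latest",    # do not replay topic history on first start
    "max.poll.interval.ms": 600_000,  # must exceed worst-case batch processing time
}

def consume_batch(records, process, commit, seen_keys):
    """Iron Laws 2 and 3: deduplicate by key, process durably, commit last."""
    for record in records:
        key = record["key"]
        if key in seen_keys:  # duplicate from at-least-once redelivery
            continue
        process(record)       # must complete durably before the commit below
        seen_keys.add(key)
    commit()                  # committing after processing means a crash
                              # replays the batch instead of losing it

# Simulated usage: the duplicate of order-1 is skipped, commit runs once at the end.
seen: set = set()
processed: list = []
batch = [
    {"key": "order-1", "value": 100},
    {"key": "order-1", "value": 100},  # redelivered duplicate
    {"key": "order-2", "value": 250},
]
consume_batch(batch, lambda r: processed.append(r["key"]), lambda: None, seen)
```

A real implementation would persist `seen_keys` (or rely on keyed upserts in the sink) so deduplication survives restarts; an in-memory set only covers duplicates within one process lifetime.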

Anti-Patterns

| Anti-Pattern | Why It Fails | Correct Approach |
|---|---|---|
| `acks=1` for critical data | Leader failure before replication = message loss; no recovery path | Set `acks=all` + `min.insync.replicas=2`; use retries with an idempotent producer |
| Committing offsets before processing | Consumer crash after commit but before processing = message silently dropped | Process completely and durably, then commit; or use transactions for exactly-once |
| Non-idempotent consumer logic | Rebalances and restarts deliver duplicates; state corrupted without deduplication | Deduplicate by message key/sequence; use idempotent DB writes (upsert by key) |
| `auto.offset.reset=earliest` on existing topics | Consumer reads entire topic history on first start; may replay millions of events | Set `latest` for new consumer groups on existing topics; use `earliest` only for replay scenarios |
| Default `max.poll.interval.ms` (300 s) with slow processors | Slow processing triggers a consumer group rebalance mid-batch; duplicate processing | Set `max.poll.interval.ms` > worst-case processing time; reduce batch size if needed |
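The "upsert by key" correction in the table can be sketched with a dict standing in for a keyed store; in a real system this would be, for example, an SQL `INSERT ... ON CONFLICT ... DO UPDATE` or a keyed write to a KV store. The record shape here is hypothetical.

```python
def upsert(store: dict, record: dict) -> None:
    """Idempotent write: applying the same record N times yields the same state
    as applying it once, so redelivered duplicates are harmless."""
    store[record["key"]] = record["value"]

store: dict = {}
upsert(store, {"key": "order-1", "value": 100})
upsert(store, {"key": "order-1", "value": 100})  # duplicate delivery: same end state
upsert(store, {"key": "order-2", "value": 250})
```

Contrast this with an append (`INSERT` without a key constraint) or an increment (`balance += value`), both of which double-count on redelivery and therefore violate Iron Law 3.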

Memory Protocol (MANDATORY)

Before starting:

cat .claude/context/memory/learnings.md

After completing: Record any new patterns or exceptions discovered.

ASSUME INTERRUPTION: Your context may reset. If it's not in memory, it didn't happen.

First Seen: Feb 25, 2026