# Kafka Development Practices Skill
- All topic names must come from config values (Typesafe Config or PureConfig).
- Use the `Format` or `Codec` from the JSON, Avro, or other serialization library already used in the project.
- Streams logic must be tested with `TopologyTestDriver` (unit test) plus an integration test against local Kafka.
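The first bullet names Typesafe Config/PureConfig for JVM projects; as a language-neutral sketch of the same idea, topic names can be resolved from configuration rather than hardcoded at the call site. The helper, key scheme, and defaults below are illustrative, not part of the skill:

```python
import os

def topic_from_config(key: str, default: str) -> str:
    """Resolve a topic name from configuration (env vars here) with a
    fallback default. Key naming scheme is a hypothetical convention."""
    return os.environ.get(f"KAFKA_TOPIC_{key.upper()}", default)

# Call sites reference the resolved name, never a literal topic string.
ORDERS_TOPIC = topic_from_config("orders", "orders.v1")
```

The same pattern maps directly onto a HOCON file read through PureConfig: one config entry per topic, resolved once at startup.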
## Iron Laws
- ALWAYS set explicit `acks=all` and `min.insync.replicas=2` for production producers — `acks=1` (the default before Kafka 3.0) loses messages on leader failure before replication; `acks=0` provides no delivery guarantee.
- NEVER commit offsets before processing is complete — committing first causes data loss if the consumer crashes between the commit and the processing; always commit after successful processing.
- ALWAYS implement idempotent consumers (deduplicate by message key or sequence number) — Kafka's at-least-once delivery guarantees duplicate messages on consumer restarts; processing without deduplication corrupts state.
- NEVER use `auto.offset.reset=earliest` in production consumers for existing topics — `earliest` replays the entire topic history from the beginning on first start; use `latest` for new consumer groups on existing topics.
- ALWAYS set `max.poll.interval.ms` to a value larger than your maximum processing time — if processing takes longer than `max.poll.interval.ms`, the consumer is evicted from the group, triggering a rebalance and duplicate processing.
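The producer-side laws above can be sketched as a config dict in the style accepted by confluent-kafka-python (the dict itself is library-agnostic; broker hosts are placeholders). Note that `min.insync.replicas` is a topic/broker setting, not a producer setting — it only takes effect together with `acks=all`:

```python
# Producer settings from the rules above; hosts are illustrative.
producer_config = {
    "bootstrap.servers": "broker1:9092,broker2:9092",
    "acks": "all",               # wait for all in-sync replicas to ack
    "enable.idempotence": True,  # safe retries without producing duplicates
}

# min.insync.replicas=2 belongs on the TOPIC (or broker default), e.g.:
#   kafka-configs --alter --topic orders --add-config min.insync.replicas=2
# With acks=all, a write then fails fast if fewer than 2 replicas are in sync.
```

With confluent-kafka-python this dict would be passed straight to `confluent_kafka.Producer(producer_config)`; the equivalent Java `Properties` keys are identical strings.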
## Anti-Patterns
| Anti-Pattern | Why It Fails | Correct Approach |
|---|---|---|
| `acks=1` for critical data | Leader failure before replication = message loss; no recovery path | Set `acks=all` + `min.insync.replicas=2`; use retries with an idempotent producer |
| Committing offsets before processing | Consumer crash after commit but before processing = message silently dropped | Process completely and durably, then commit; or use transactions for exactly-once |
| Non-idempotent consumer logic | Rebalances and restarts deliver duplicates; state corrupted without deduplication | Deduplicate by message key/sequence; use idempotent DB writes (upsert by key) |
| `auto.offset.reset=earliest` on existing topics | Consumer reads entire topic history on first start; may replay millions of events | Set `latest` for new consumer groups on existing topics; use `earliest` only for replay scenarios |
| Default `max.poll.interval.ms` (300 s) for slow processors | Slow processing triggers a consumer group rebalance mid-batch; duplicate processing | Set `max.poll.interval.ms` > worst-case processing time; reduce batch size if needed |
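The "deduplicate, then commit" rows above combine into one consumer loop shape. This is a minimal sketch of the pure deduplication logic only — `messages`, `processed_keys`, and `store` are stand-ins for the real consumer batch, a durable dedup table, and the downstream state (all names are illustrative):

```python
def process_batch(messages, processed_keys, store):
    """Apply each (key, value) message at most once; return newly applied keys.

    processed_keys: durable set of already-applied keys (stand-in for a
    DB table or an upsert-by-key target). store: downstream state.
    """
    applied = []
    for key, value in messages:
        if key in processed_keys:   # duplicate from a retry or rebalance
            continue
        store[key] = value          # idempotent write (upsert by key)
        processed_keys.add(key)     # record only after the durable write
        applied.append(key)
    return applied
    # Only AFTER the whole batch is durably applied would the real
    # consumer call commit() — never before processing.
```

Replaying the same batch (as a rebalance would) then becomes a no-op, which is exactly the property the "Non-idempotent consumer logic" row demands.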
## Memory Protocol (MANDATORY)
- **Before starting:** run `cat .claude/context/memory/learnings.md`.
- **After completing:** record any new patterns or exceptions discovered.
- **ASSUME INTERRUPTION:** Your context may reset. If it's not in memory, it didn't happen.
Repository: oimiragieo/agent-studio (first seen Feb 25, 2026)