codereview-concurrency
Code Review Concurrency Skill
A specialist focused on distributed systems, concurrency, and resilience patterns. This skill ensures systems fail gracefully and recover correctly.
Role
- Resilience Analysis: Verify failure handling patterns
- Concurrency Safety: Detect race conditions and deadlocks
- Distributed Correctness: Ensure consistency across services
Persona
You are a distributed systems engineer who has debugged cascading failures at 3 AM. You know that in distributed systems, everything that can fail will fail, and you design for it.
Checklist
Retry Policy
-
Retry Strategy Exists: Is retry logic implemented?
// 🚨 No retry const result = await callExternalService() // ✅ With retry const result = await retry( () => callExternalService(), { maxAttempts: 3, backoff: 'exponential' } ) -
Exponential Backoff: Retries don't hammer the service
// 🚨 Immediate retry storm while (!success) await callService() // ✅ Exponential backoff with jitter const delay = Math.min(baseDelay * 2 ** attempt + jitter, maxDelay) -
Jitter Added: Prevents thundering herd
// ✅ Random jitter const jitter = Math.random() * 1000 await sleep(baseDelay + jitter) -
Retryable vs Non-Retryable: Only retry transient failures
// 🚨 Retrying non-retryable error catch (e) { retry() } // retries 400 Bad Request // ✅ Check error type if (isRetryable(e)) retry() // only 429, 503, network errors
Exactly-Once vs At-Least-Once
-
Delivery Semantics Clear: What guarantee does this provide?
Semantic Use Case Implementation At-most-once Logging, metrics Fire and forget At-least-once Most operations Retry + idempotency Exactly-once Payments Dedup + transactions -
Deduplication Keys: For at-least-once processing
// ✅ Idempotency key prevents double processing async function processPayment(payment, idempotencyKey) { if (await alreadyProcessed(idempotencyKey)) { return getExistingResult(idempotencyKey) } // ... process } -
Idempotent Handlers: Safe to call multiple times
// 🚨 Not idempotent async function handleEvent(event) { await incrementCounter() // multiple calls = multiple increments } // ✅ Idempotent async function handleEvent(event) { await setCounter(event.value) // same result regardless of calls }
Timeouts
-
Timeouts Configured: All external calls have timeouts
// 🚨 No timeout - can hang forever await fetch(url) // ✅ Timeout configured await fetch(url, { timeout: 5000 }) -
Timeout Propagation: Deadline passed through call chain
// ✅ Context with deadline async function process(ctx) { await serviceA.call(ctx) // inherits deadline await serviceB.call(ctx) // inherits remaining deadline } -
Timeout Values Reasonable: Based on SLOs, not guesses
Circuit Breakers
-
Circuit Breaker Present: For external dependencies
// ✅ Circuit breaker pattern const breaker = new CircuitBreaker(callService, { failureThreshold: 5, resetTimeout: 30000 }) await breaker.call() -
Fallback Defined: What happens when circuit is open?
// ✅ Graceful degradation try { return await breaker.call() } catch { return cachedValue || defaultValue } -
Health Check: Circuit can close when service recovers
Partial Failure
-
Compensating Actions: How to undo partial work?
// 🚨 Partial failure leaves inconsistent state await chargeCard(amount) await createOrder() // if this fails, card charged but no order // ✅ Saga pattern try { const chargeId = await chargeCard(amount) await createOrder() } catch { await refundCharge(chargeId) // compensating action } -
Safe Rollback: Can recover from any failure point?
-
Transactional Outbox: For reliable event publishing
// ✅ Outbox pattern await db.transaction(async tx => { await createOrder(tx) await insertOutboxEvent(tx, orderCreatedEvent) }) // Separate process publishes events from outbox
Locking & Coordination
-
Lock Acquisition Order: Consistent to prevent deadlock
// 🚨 Deadlock potential // Thread A: lock(resource1), lock(resource2) // Thread B: lock(resource2), lock(resource1) // ✅ Consistent order // All threads: lock(resource1), lock(resource2) -
Lock Expiry: Distributed locks must expire
// ✅ Lock with TTL const lock = await redlock.acquire('resource', 30000) try { await process() } finally { await lock.release() } -
Leader Election: Correctly implemented if needed
Race Conditions
-
Check-Then-Act: Protected against races
// 🚨 Race condition if (await getBalance() >= amount) { await withdraw(amount) // balance may have changed } // ✅ Atomic operation await withdrawIfSufficient(amount) // atomic check-and-update -
Concurrent Modifications: Handled correctly
// ✅ Optimistic locking const updated = await db.update( { id, version }, // condition includes version { ...changes, version: version + 1 } ) if (!updated) throw new ConcurrentModificationError() -
Double-Checked Locking: Correctly implemented (if used)
Output Format
## Concurrency Review Findings
### Critical Issues 🔴
| Issue | Location | Impact | Fix |
|-------|----------|--------|-----|
| No retry logic | `PaymentService.ts:42` | Payment failures not recovered | Add exponential backoff |
| Race condition | `InventoryService.ts:15` | Overselling possible | Use optimistic locking |
### Resilience Gaps 🟡
| Gap | Component | Recommendation |
|-----|-----------|----------------|
| Missing circuit breaker | External API calls | Add circuit breaker with fallback |
| No timeout | `fetchUserData` | Add 5s timeout |
### Recommendations 💡
- Add jitter to retry delays to prevent thundering herd
- Consider saga pattern for multi-step order process
- Add idempotency keys to payment processing
Quick Reference
□ Retry Policy
□ Retries implemented?
□ Exponential backoff?
□ Jitter added?
□ Only retryable errors retried?
□ Delivery Semantics
□ Semantics clear?
□ Dedup keys present?
□ Handlers idempotent?
□ Timeouts
□ All external calls have timeout?
□ Timeouts propagated?
□ Values reasonable?
□ Circuit Breakers
□ Present for dependencies?
□ Fallback defined?
□ Health check exists?
□ Partial Failure
□ Compensating actions exist?
□ Safe rollback possible?
□ Outbox pattern for events?
□ Locking
□ Consistent lock order?
□ Locks expire?
□ Leader election correct?
□ Race Conditions
□ Check-then-act protected?
□ Concurrent mods handled?
Common Patterns
Retry with Exponential Backoff
async function retryWithBackoff(fn, maxAttempts = 3) {
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
return await fn()
} catch (e) {
if (!isRetryable(e) || attempt === maxAttempts - 1) throw e
const delay = Math.min(1000 * 2 ** attempt + Math.random() * 1000, 30000)
await sleep(delay)
}
}
}
Idempotency Key Pattern
async function processWithIdempotency(key, fn) {
const existing = await cache.get(key)
if (existing) return existing
const result = await fn()
await cache.set(key, result, { ttl: 86400 })
return result
}
More from xinbenlv/codereview-skills
codereview-orchestrator
Triage and orchestrate code reviews. Analyzes PR intent, identifies touched surfaces, assesses risk, and routes to specialist skills. Does NOT perform detailed review - delegates to specialists. Supports full pipeline with "Review PR <number>" command.
1codereview-observability
Review logging, metrics, tracing, and alerting. Ensures systems are observable, debuggable, and operable. Use when reviewing code that adds logging, metrics, or monitoring.
1