tzurot-caching
Tzurot Caching & Horizontal Scaling Patterns
Overview
This skill covers caching patterns for Tzurot v3, with a focus on horizontal scaling readiness. Use this skill when:
- Adding a new cache
- Evaluating if a cache needs cross-instance invalidation
- Debugging cache-related issues
- Planning for horizontal scaling
Cache Decision Tree
When adding or modifying a cache, follow this decision tree:
Does staleness cause incorrect behavior?
├── YES → Redis + pub/sub invalidation
│         (e.g., channel activations - stale = missed messages)
│
└── NO → Is it expensive external API data?
    ├── YES → Must survive Redis restarts?
    │   ├── YES → Two-tier (L1 Redis + L2 PostgreSQL)
    │   │         (e.g., vision descriptions - expensive API, immutable)
    │   │
    │   └── NO → Redis with TTL
    │             (e.g., OpenRouter model list - shared across instances)
    │
    └── NO → Is it read-heavy optimization?
        ├── YES → In-memory TTL only
        │         (e.g., autocomplete - 60s staleness acceptable)
        │
        └── NO → Is it rate limiting?
            ├── YES → In-memory Map (local is correct)
            │         (e.g., notification cooldowns)
            │
            └── NO → Probably don't need caching
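The same decision tree can be written as a small function, which is handy when reviewing a proposed cache against the questions in order. This is only a sketch: `CacheTraits`, `CacheStrategy`, and `chooseCacheStrategy` are hypothetical names invented for illustration and do not exist in the Tzurot codebase.

```typescript
// Hypothetical types for illustration; not part of common-types.
type CacheStrategy =
  | 'redis-pubsub' // Redis + pub/sub invalidation
  | 'two-tier' // L1 Redis + L2 PostgreSQL
  | 'redis-ttl' // Redis with TTL
  | 'memory-ttl' // In-memory TTL only
  | 'memory-map' // In-memory Map (local is correct)
  | 'none'; // Probably don't need caching

interface CacheTraits {
  stalenessBreaksCorrectness: boolean;
  expensiveExternalData: boolean;
  mustSurviveRedisRestart: boolean;
  readHeavy: boolean;
  rateLimiting: boolean;
}

// Walks the questions in the same order as the tree above.
function chooseCacheStrategy(traits: CacheTraits): CacheStrategy {
  if (traits.stalenessBreaksCorrectness) return 'redis-pubsub';
  if (traits.expensiveExternalData) {
    return traits.mustSurviveRedisRestart ? 'two-tier' : 'redis-ttl';
  }
  if (traits.readHeavy) return 'memory-ttl';
  if (traits.rateLimiting) return 'memory-map';
  return 'none';
}

// Example: vision descriptions are expensive, immutable, and must survive restarts.
const visionStrategy = chooseCacheStrategy({
  stalenessBreaksCorrectness: false,
  expensiveExternalData: true,
  mustSurviveRedisRestart: true,
  readHeavy: true,
  rateLimiting: false,
}); // 'two-tier'
```

Because the checks run top to bottom, a cache that is both correctness-critical and expensive still resolves to pub/sub invalidation, matching the tree.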
Cache Types & When to Use Each
1. Redis + Pub/Sub Invalidation
Use when: Staleness causes correctness issues (wrong behavior, not just stale UX).
Pattern: Redis pub/sub broadcasts invalidation events to all instances.
Example: Channel activation cache
// Service definition (common-types)
export class ChannelActivationCacheInvalidationService extends BaseCacheInvalidationService<Event> {
  constructor(redis: Redis) {
    super(
      redis,
      REDIS_CHANNELS.CHANNEL_ACTIVATION_CACHE_INVALIDATION,
      'ChannelActivationCacheInvalidation',
      validator
    );
  }

  async invalidateChannel(channelId: string): Promise<void> {
    await this.publish({ type: 'channel', channelId });
  }
}

// Subscriber (bot-client startup)
await invalidationService.subscribe(event => {
  if (event.type === 'channel') {
    invalidateChannelActivationCache(event.channelId);
  }
});

// Publisher (after successful operation)
await invalidationService.invalidateChannel(channelId);
Current implementations:
- CacheInvalidationService - Personality cache
- ChannelActivationCacheInvalidationService - Channel activation cache
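The service example passes a `validator` without showing it. One plausible shape is a type guard that rejects malformed pub/sub payloads before they reach the cache. The sketch below assumes the event carries only the `type: 'channel'` and `channelId` fields seen in the example; the names `ChannelInvalidationEvent`, `isChannelInvalidationEvent`, and `parseInvalidationMessage` are hypothetical, and the real validator in common-types may look different.

```typescript
// Assumed event shape, based only on the publish() call in the example above.
type ChannelInvalidationEvent = { type: 'channel'; channelId: string };

// Type guard: true only for objects that match the assumed event shape.
function isChannelInvalidationEvent(value: unknown): value is ChannelInvalidationEvent {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  const channelId = record['channelId'];
  return record['type'] === 'channel' && typeof channelId === 'string' && channelId.length > 0;
}

// Pub/sub messages arrive as strings, so parse defensively and drop anything invalid.
function parseInvalidationMessage(raw: string): ChannelInvalidationEvent | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isChannelInvalidationEvent(parsed) ? parsed : null;
  } catch {
    return null; // Not valid JSON
  }
}
```

Dropping invalid messages rather than throwing keeps one bad publisher from crashing every subscriber.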
2. Redis with TTL (No Invalidation)
Use when: Data is expensive to fetch, shared across instances, and TTL-based staleness is acceptable.
Pattern: Store in Redis with a TTL, and optionally layer an in-memory cache on top.
Example: OpenRouter model list
// Redis as source of truth (24 hour TTL)
const cached = await redis.get('openrouter:models');
if (cached) return JSON.parse(cached);
const models = await fetchFromOpenRouter();
await redis.setex('openrouter:models', 86400, JSON.stringify(models));
return models;
Current implementations:
- OpenRouterModelCache.ts - Model list (24h Redis TTL + 5min memory TTL)
- VisionDescriptionCache.ts - Image descriptions (1h Redis TTL)
- VoiceTranscriptCache.ts - Voice transcripts (5min Redis TTL)
- RedisDeduplicationCache.ts - Request dedup (5sec Redis TTL)
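The "24h Redis TTL + 5min memory TTL" combination can be sketched as a short-lived local copy in front of Redis, so that most reads skip the network while Redis remains the shared source of truth. This is an illustration under assumptions, not the code in OpenRouterModelCache.ts: `LayeredCache` and `RedisLike` are hypothetical names, and `RedisLike` covers only the `get` and `setex` calls used in the example above.

```typescript
// Minimal subset of a Redis client used by this sketch.
interface RedisLike {
  get(key: string): Promise<string | null>;
  setex(key: string, seconds: number, value: string): Promise<unknown>;
}

const REDIS_TTL_SECONDS = 24 * 60 * 60; // 24 hours, shared across instances
const MEMORY_TTL_MS = 5 * 60 * 1000; // 5 minutes, local to this instance

class LayeredCache<T> {
  private local: { value: T; expiresAt: number } | undefined;
  private readonly redis: RedisLike;
  private readonly key: string;
  private readonly fetchFresh: () => Promise<T>;
  private readonly now: () => number;

  constructor(
    redis: RedisLike,
    key: string,
    fetchFresh: () => Promise<T>,
    now: () => number = () => Date.now()
  ) {
    this.redis = redis;
    this.key = key;
    this.fetchFresh = fetchFresh;
    this.now = now;
  }

  async get(): Promise<T> {
    // 1. Local memory: no network round trip
    if (this.local !== undefined && this.local.expiresAt > this.now()) {
      return this.local.value;
    }
    // 2. Redis: shared across instances
    const cached = await this.redis.get(this.key);
    let value: T;
    if (cached !== null) {
      value = JSON.parse(cached) as T;
    } else {
      // 3. Upstream fetch, then write back to Redis
      value = await this.fetchFresh();
      await this.redis.setex(this.key, REDIS_TTL_SECONDS, JSON.stringify(value));
    }
    this.local = { value, expiresAt: this.now() + MEMORY_TTL_MS };
    return value;
  }
}
```

With this layering, the worst-case local staleness is the memory TTL, and only one instance per Redis TTL window pays for the upstream fetch.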
3. Two-Tier Cache (L1 Redis + L2 PostgreSQL)
Use when: Data is expensive to compute, shared across instances, and must survive Redis restarts.
Pattern: L1 (Redis with TTL) → L2 (PostgreSQL persistent) → API fallback → write to both tiers.
Example: Vision description cache (image → text descriptions)
Lookup Flow:
1. Check L1 (Redis) → cache HIT → return
2. Check L2 (PostgreSQL) → cache HIT → populate L1 → return
3. Call vision API → store in both L1 and L2 → return
// L1: VisionDescriptionCache (Redis, 1h TTL)
const l1Cache = new VisionDescriptionCache(redis);

// L2: PersistentVisionCache (PostgreSQL, no TTL)
const l2Cache = new PersistentVisionCache(prisma);

// Lookup pattern. The cache key is the attachment ID; imageUrl and model
// are supplied by the caller and are only needed on a full cache miss.
async function getImageDescription(
  attachmentId: string,
  imageUrl: string,
  model: string
): Promise<string> {
  // L1 check
  const l1Result = await l1Cache.get(attachmentId);
  if (l1Result) return l1Result;

  // L2 check
  const l2Result = await l2Cache.get(attachmentId);
  if (l2Result) {
    // Populate L1 from L2
    await l1Cache.set(attachmentId, l2Result.description, l2Result.model);
    return l2Result.description;
  }

  // API fallback
  const description = await callVisionAPI(imageUrl);

  // Write to both tiers
  await Promise.all([
    l1Cache.set(attachmentId, description, model),
    l2Cache.set({ attachmentId, description, model }),
  ]);
  return description;
}
Key strategy: Uses Discord attachment snowflake IDs (stable) instead of ephemeral CDN URLs (expire after ~24h). This ensures cache hits even when the URL changes.
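To illustrate why the attachment ID makes a stable key, the sketch below derives it from a CDN URL when only the URL is at hand (in most code paths the ID is already available on the attachment object). It assumes Discord attachment URLs follow the `/attachments/<channelId>/<attachmentId>/<filename>` path layout with signed query parameters that change on refresh; `attachmentIdFromUrl`, `visionCacheKey`, and the `vision:description:` prefix are hypothetical and not taken from the codebase.

```typescript
// Extracts the attachment snowflake from a Discord CDN URL, or null if the
// URL does not match the assumed /attachments/<channelId>/<attachmentId>/ layout.
function attachmentIdFromUrl(url: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return null; // Not a parseable URL
  }
  const match = /^\/attachments\/(\d+)\/(\d+)\//.exec(parsed.pathname);
  return match?.[2] ?? null;
}

// Hypothetical key format: the key depends only on the stable ID,
// never on the expiring query string.
function visionCacheKey(attachmentId: string): string {
  return `vision:description:${attachmentId}`;
}
```

Two URLs for the same attachment that differ only in their signed query parameters resolve to the same ID, and therefore to the same cache entry.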
Current implementations:
| Tier | Service | Storage | TTL | Purpose |
|---|---|---|---|---|
| L1 | VisionDescriptionCache.ts | Redis | 1 hour | Fast lookup, network-shared |
| L2 | PersistentVisionCache.ts | PostgreSQL | Forever | Survives Redis restarts |
When to use this pattern:
- API calls are expensive ($$$) or rate-limited
- Data changes rarely or never (image descriptions are immutable)
- Redis may restart (Railway deployments)
- Historical data has long-tail access patterns
4. In-Memory TTL Cache
Use when: Read-heavy optimization where staleness is an acceptable UX issue, not a correctness issue.
Pattern: Use TTLCache from common-types.
import { TTLCache } from '@tzurot/common-types';
const cache = new TTLCache<ResponseType>({
  ttl: 60 * 1000, // 60 seconds
  maxSize: 500, // Max entries
});
// Use
const cached = cache.get(key);
if (cached) return cached;
const fresh = await fetchData();
cache.set(key, fresh);
return fresh;
Current implementations:
- autocompleteCache.ts - User autocomplete data (60s TTL, 500 users)
- channelActivationCache in GatewayClient - Now with pub/sub invalidation
- ModelCapabilityChecker.ts - Vision capability flags (5min TTL)
- PersonalityService.ts cache - Loaded personalities (5min TTL, pub/sub invalidation)
5. In-Memory Map (No TTL)
Use when: Rate limiting or cooldowns where local-per-instance is actually correct.
Pattern: Simple Map with periodic cleanup.
const COOLDOWN_MS = 60 * 60 * 1000; // Example value: 1 hour

// key → timestamp (ms) of the last event
const cooldowns = new Map<string, number>();

function isOnCooldown(key: string): boolean {
  const lastTime = cooldowns.get(key);
  if (lastTime === undefined) return false;
  return Date.now() - lastTime < COOLDOWN_MS;
}

function setCooldown(key: string): void {
  cooldowns.set(key, Date.now());
}

// Cleanup old entries periodically (see tzurot-async-flow for timer concerns)
Current implementations:
- notificationCache.ts - User notification timestamps (1h cooldown)
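The periodic cleanup mentioned in the pattern can be kept as a pure function so that it is testable and independent of how it is scheduled. A minimal sketch, assuming the map stores the last-event timestamp in milliseconds per key; `sweepExpiredCooldowns` is a hypothetical helper, not a function from notificationCache.ts.

```typescript
// Removes entries whose cooldown has fully elapsed and returns how many were removed.
// Deleting from a Map while iterating it with forEach is safe in JavaScript.
function sweepExpiredCooldowns(
  cooldowns: Map<string, number>,
  now: number,
  cooldownMs: number
): number {
  let removed = 0;
  cooldowns.forEach((lastTime, key) => {
    if (now - lastTime >= cooldownMs) {
      cooldowns.delete(key);
      removed++;
    }
  });
  return removed;
}
```

How often to call it is a separate decision. Each instance sweeps only its own map, which is consistent with the cooldowns themselves being local per instance.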
Horizontal Scaling Concerns
What DOES Break with Multiple Instances
- In-memory caches without invalidation - Each instance has different data
- Timer-based cleanup (setInterval) - Each instance runs its own timers
- In-memory state - Sessions, connection pools, etc.
What DOESN'T Break
- Redis-backed caches - Shared state across instances
- Request-scoped memory - Data lives only during request
- Stateless services - No instance-specific state
Cache Audit Summary
| Cache | Location | TTL | Scaling Risk | Status |
|---|---|---|---|---|
| Channel Activation | GatewayClient.ts | 30s | ✅ Pub/sub added | |
| Autocomplete | autocompleteCache.ts | 60s | Minor | Acceptable |
| Notification | notificationCache.ts | 1 hour | Minor | Local is correct |
| Global Config | preset/autocomplete.ts | 60s | None | Single entry |
| Personality | PersonalityService.ts | 5 min | None | Has pub/sub |
| Model Capability | ModelCapabilityChecker.ts | 5 min | None | Reads from Redis |
| OpenRouter Models | OpenRouterModelCache.ts | 24h Redis | None | Redis is truth |
| Vision Description | VisionDescriptionCache.ts | 1 hour | None | L1/L2 two-tier |
| Voice Transcript | VoiceTranscriptCache.ts | 5 min | None | Redis-backed |
| Request Dedup | RedisDeduplicationCache.ts | 5 sec | None | Redis-backed |
Full audit: docs/architecture/CACHING_AUDIT.md
Creating a New Cache with Pub/Sub Invalidation
For cross-instance cache invalidation, follow the 6-step guide:
1. Add Redis channel in common-types/constants/queue.ts
2. Create invalidation service extending BaseCacheInvalidationService
3. Export from common-types index
4. Register in service registry (bot-client)
5. Subscribe on startup
6. Publish on changes
📚 See: docs/reference/caching/PUBSUB_INVALIDATION_GUIDE.md for complete implementation with code examples.
Existing implementations to reference:
- CacheInvalidationService - Personality cache
- ChannelActivationCacheInvalidationService - Channel activation cache
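The effect of subscribing on startup and publishing on changes can be seen in a small in-process simulation of two instances. Everything here is a stand-in invented for illustration: `FakeBus` replaces the Redis pub/sub channel, `BotInstance` replaces a bot-client process with a local activation cache, and a plain `Map` plays the database. None of these names come from the codebase or from BaseCacheInvalidationService.

```typescript
type ChannelEvent = { type: 'channel'; channelId: string };
type ChannelEventHandler = (event: ChannelEvent) => void;

// In-process stand-in for a Redis pub/sub channel.
class FakeBus {
  private readonly handlers: ChannelEventHandler[] = [];

  subscribe(handler: ChannelEventHandler): void {
    this.handlers.push(handler);
  }

  publish(event: ChannelEvent): void {
    // Round-trip through JSON to mimic messages crossing the wire as strings
    const wire = JSON.stringify(event);
    for (const handler of this.handlers) {
      handler(JSON.parse(wire) as ChannelEvent);
    }
  }
}

// One simulated process with its own in-memory cache in front of a shared "database".
class BotInstance {
  private readonly activationCache = new Map<string, boolean>();
  private readonly db: Map<string, boolean>;

  constructor(db: Map<string, boolean>, bus?: FakeBus) {
    this.db = db;
    // Subscribe on startup: drop the local entry when any instance reports a change
    bus?.subscribe(event => {
      this.activationCache.delete(event.channelId);
    });
  }

  isActive(channelId: string): boolean {
    const cached = this.activationCache.get(channelId);
    if (cached !== undefined) return cached;
    const fresh = this.db.get(channelId) ?? false;
    this.activationCache.set(channelId, fresh);
    return fresh;
  }
}

// Publish on changes: write first, then broadcast the invalidation.
function setChannelActive(
  db: Map<string, boolean>,
  channelId: string,
  active: boolean,
  bus?: FakeBus
): void {
  db.set(channelId, active);
  bus?.publish({ type: 'channel', channelId });
}
```

Without a bus, an instance that cached `false` before the write keeps returning `false` until its entry expires. With the bus, the next read after the write goes back to the database.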
TTLCache Usage
The TTLCache class from common-types is the standard for in-memory caching:
import { TTLCache } from '@tzurot/common-types';
// Create cache
const cache = new TTLCache<ValueType>({
  ttl: 60 * 1000, // TTL in milliseconds
  maxSize: 100, // Maximum entries (LRU eviction)
});

// Operations
cache.set('key', value); // Add/update entry
const cached = cache.get('key'); // Get entry (undefined if expired/missing)
cache.delete('key'); // Remove specific entry
cache.clear(); // Remove all entries
const has = cache.has('key'); // Check if exists (and not expired)
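For intuition about the semantics described above (entries expire after `ttl`, and the least recently used entry is evicted at `maxSize`), here is one way such a cache could be built on a plain `Map`. This is not the TTLCache implementation from common-types, which may differ in detail; `SketchTTLCache` is a hypothetical name, and the injectable `now` parameter exists only to make the sketch testable.

```typescript
interface SketchTTLCacheOptions {
  ttl: number; // Milliseconds
  maxSize: number; // Maximum number of entries
}

class SketchTTLCache<V> {
  // A Map iterates in insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttl: number;
  private readonly maxSize: number;
  private readonly now: () => number;

  constructor(options: SketchTTLCacheOptions, now: () => number = () => Date.now()) {
    this.ttl = options.ttl;
    this.maxSize = options.maxSize;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // Expired: drop lazily on read
      return undefined;
    }
    // Re-insert to mark this key as most recently used
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value); // Evict least recently used
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttl });
  }

  has(key: string): boolean {
    const entry = this.entries.get(key);
    if (entry === undefined) return false;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return false;
    }
    return true;
  }

  delete(key: string): boolean {
    return this.entries.delete(key);
  }

  clear(): void {
    this.entries.clear();
  }
}
```

In this sketch expired entries are removed lazily on access, so no timer is needed, which sidesteps the per-instance timer concern for this kind of cache.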
Related Documentation
- docs/architecture/CACHING_AUDIT.md - Full cache inventory and analysis
- tzurot-async-flow skill - Timer patterns and BullMQ alternatives
- tzurot-architecture skill - Service boundaries and data flow
- tzurot-db-vector skill - Database caching considerations
Related Skills
- tzurot-async-flow - Timer patterns, BullMQ for scheduled cleanup
- tzurot-architecture - Where caches belong in service boundaries
- tzurot-shared-types - Type definitions for cache events
- tzurot-observability - Logging cache operations