rate-limit-strategist
Rate Limit Strategist Protocol
This skill designs the throttling and quota mechanisms that protect an API from noisy neighbors, accidental infinite loops in client code, and malicious abuse. It shifts the focus from "how to code it" to "what the limits should actually be."
Core assumption: Without rate limits, your API will eventually be DDoSed by your own front-end bug.
1. Algorithm Selection (Static)
Select the right rate-limiting algorithm based on traffic characteristics:
- Token Bucket / Leaky Bucket: Best for general APIs. Token bucket allows small bursts of traffic (e.g., a burst of 10 requests) while capping the average rate; leaky bucket drains at a constant rate and smooths bursts out instead.
- Fixed Window: Simple to implement (e.g., reset at the top of the minute), but vulnerable to edge spikes (submitting 100 requests at 00:59 and 100 at 01:00).
- Sliding Window Log/Counter: More accurate, prevents edge spikes. Best for strict, paid-tier APIs.
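To make the burst-versus-average trade-off concrete, here is a minimal token bucket sketch. The class name `TokenBucket`, its parameters, and the injectable `clock` are illustrative assumptions, not part of this skill's required output; a production limiter would also need shared state and locking.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket: capacity bounds the burst, refill_rate the average."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity        # maximum burst size, in requests
        self.refill_rate = refill_rate  # tokens added per second (sustained rate)
        self.clock = clock              # injectable so the behavior can be tested
        self.tokens = capacity          # start full so an initial burst is allowed
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=10` and `refill_rate=1.0`, a client can send 10 requests at once, after which it is held to roughly one request per second until the bucket refills.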
2. Granularity & Dimensions
Rate limits should rarely be global. Define multiple layers:
- Layer 1: Global/IP (Infrastructure): Prevent DDoS (e.g., 500 req/sec per IP at Cloudflare/WAF).
- Layer 2: User Level (Application): Prevent noisy neighbors (e.g., 100 req/min for User A, 1000 req/min for Enterprise User B).
- Layer 3: Endpoint Level (Business Logic): Highly restrictive on expensive endpoints (e.g., `/export-pdf` limited to 1 req/min).
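The three layers above can be sketched as a single check that admits a request only when every applicable layer still has room. This is a rough illustration, assuming fixed-window counters for brevity and an endpoint limit keyed per user; `Limit`, `LayeredLimiter`, and `layers_for` are hypothetical names, and the quotas mirror the examples in the list.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Limit:
    name: str            # layer label: "ip", "user", or "endpoint"
    max_requests: int
    window_seconds: int


class LayeredLimiter:
    """Checks every layer; a request passes only if all layers have room."""

    def __init__(self) -> None:
        # (layer name, key, window index) -> requests counted in that window.
        # Old windows are never evicted here; a real store would expire them.
        self._counts: Dict[Tuple[str, str, int], int] = {}

    def check(self, layers: List[Tuple[Limit, str]], now: float) -> Optional[str]:
        """Return the name of the first exhausted layer, or None if allowed."""
        buckets = []
        for limit, key in layers:
            bucket = (limit.name, key, int(now // limit.window_seconds))
            if self._counts.get(bucket, 0) >= limit.max_requests:
                return limit.name
            buckets.append(bucket)
        # Count the request only after every layer has accepted it.
        for bucket in buckets:
            self._counts[bucket] = self._counts.get(bucket, 0) + 1
        return None


IP_LIMIT = Limit("ip", 500, 1)            # 500 req/sec per IP
USER_LIMIT = Limit("user", 100, 60)       # 100 req/min per user
EXPORT_LIMIT = Limit("endpoint", 1, 60)   # 1 req/min on /export-pdf (assumed per user)


def layers_for(ip: str, user_id: str, path: str) -> List[Tuple[Limit, str]]:
    """Pick the layers that apply to one request."""
    layers = [(IP_LIMIT, ip), (USER_LIMIT, user_id)]
    if path == "/export-pdf":
        layers.append((EXPORT_LIMIT, f"{user_id}:{path}"))
    return layers
```

Returning the name of the exhausted layer lets the caller report which quota was hit, which matters when the user-level and endpoint-level limits differ.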
3. Response Standardization
When a limit is hit, the application must respond gracefully, not just fail. Define standard headers to inform the client.
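As a sketch of what a graceful response might look like, the helper below assembles a `429` with informative headers. The function name `build_429` and its parameters are assumptions for illustration; the header names follow the `X-RateLimit-*` convention used in the sample report later in this document, and `Retry-After` is expressed in seconds.

```python
import json
import math
from typing import Dict, Tuple


def build_429(limit: int, reset_epoch: int, now_epoch: float) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a request that exceeded its quota."""
    # Round up so the client never retries a fraction of a second too early.
    retry_after = max(0, math.ceil(reset_epoch - now_epoch))
    headers = {
        "Content-Type": "application/json",
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": "0",
        "X-RateLimit-Reset": str(reset_epoch),  # Unix timestamp of the reset
        "Retry-After": str(retry_after),        # seconds to wait
    }
    body = json.dumps({
        "error": "quota_exceeded",
        "message": f"Rate limit of {limit} requests exceeded. "
                   f"Try again in {retry_after} seconds.",
    })
    return 429, headers, body
```

Deriving `Retry-After` from the same reset timestamp that goes into `X-RateLimit-Reset` keeps the two headers from contradicting each other.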
4. Output Generation
Required Outputs (Must write BOTH to docs/api-report/):
1. **Human-Readable Markdown (`docs/api-report/rate-limit-report.md`)**
### 🛑 Rate Limiting Strategy
- **Selected Algorithm:** Token Bucket
- **Implementation Layer:** Redis-backed API Gateway Plugin.
#### ⚖️ Configured Quotas
1. **Global (IP-Based):** 300 requests per minute.
2. **Standard User (Token-Based):** 60 requests per minute.
3. **Expensive Route (`POST /generate-report`):** 5 requests per hour per User.
#### 📬 Consumer Response Design
When limits are exceeded, return `429 Too Many Requests`.
**Headers Included:**
- `X-RateLimit-Limit: 60` (Total quota)
- `X-RateLimit-Remaining: 0` (Used up)
- `X-RateLimit-Reset: 1711281600` (Unix timestamp of reset)
- `Retry-After: 45` (Seconds to wait)
**Body:**
```json
{
  "error": "quota_exceeded",
  "message": "You have exceeded your plan limit of 60 req/min. Please try again in 45 seconds.",
  "upgrade_url": "https://dashboard.com/billing"
}
```
2. **Machine-Readable JSON (`docs/api-report/rate-limit-output.json`)**
```json
{
  "skill": "rate-limit-strategist",
  "algorithm": "token_bucket",
  "tiers": [
    {"type": "IP", "limit": 300, "window": "1m"},
    {"type": "User", "limit": 60, "window": "1m"},
    {"type": "Endpoint", "path": "/generate-report", "limit": 5, "window": "1h"}
  ],
  "enforced_headers": ["Retry-After", "X-RateLimit-Remaining"]
}
```
Guardrails
- Header Standardization: Remind the user that different gateways use different headers (e.g., `X-RateLimit-*` vs. the IETF draft `RateLimit` headers). Pick one and be consistent.
- Distributed State: Point out that local in-memory rate limiting fails in horizontally scaled environments. Redis, Memcached, or native Gateway limits are required.
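The distributed-state guardrail can be illustrated with a small simulation: three replicas that each keep a private counter admit three times the intended quota, while replicas that share one store hold the line. This is only a sketch; a plain `dict` stands in for Redis or Memcached, window expiry is omitted, and `CounterLimiter` and `admitted` are hypothetical names. A real shared store would also need an atomic increment (for example, Redis `INCR`).

```python
from typing import Dict, List


class CounterLimiter:
    """Counter-based limiter that keeps its count in whatever store it is given."""

    def __init__(self, store: Dict[str, int], limit: int) -> None:
        self.store = store
        self.limit = limit

    def allow(self, key: str) -> bool:
        if self.store.get(key, 0) >= self.limit:
            return False
        self.store[key] = self.store.get(key, 0) + 1
        return True


def admitted(replicas: List[CounterLimiter], requests: int, key: str) -> int:
    """Round-robin requests across replicas and count how many get through."""
    return sum(replicas[i % len(replicas)].allow(key) for i in range(requests))


LIMIT = 60

# Case 1: each replica has private memory, as with a naive in-process limiter.
local = [CounterLimiter({}, LIMIT) for _ in range(3)]

# Case 2: all replicas share one store (a dict standing in for Redis/Memcached).
shared_store: Dict[str, int] = {}
shared = [CounterLimiter(shared_store, LIMIT) for _ in range(3)]

local_admitted = admitted(local, 300, "user-a")    # 180: three times the quota
shared_admitted = admitted(shared, 300, "user-a")  # 60: the quota holds
```

The effective limit with private counters scales with the replica count, so autoscaling silently loosens the quota exactly when traffic is highest.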