api-design-ops
SKILL.md
API Design Ops
Comprehensive API design patterns covering REST (advanced), gRPC, and GraphQL. This skill provides decision frameworks, design patterns, and implementation guidance for building production APIs.
API Style Decision Tree
What kind of API do you need?
|
+-- Internal microservice-to-microservice?
| +-- High throughput, low latency needed? --> gRPC
| +-- Streaming (real-time data, logs)? --> gRPC (bidirectional streaming)
| +-- Simple request/response, team comfort? --> REST
|
+-- Public-facing API?
| +-- Third-party developers consuming it? --> REST (widest compatibility)
| +-- Mobile app with varied data needs? --> GraphQL
| +-- Browser-only, simple CRUD? --> REST
|
+-- Frontend for your own app?
| +-- Multiple clients with different data shapes? --> GraphQL
| +-- Single client, straightforward data? --> REST
| +-- Real-time updates needed? --> GraphQL subscriptions or SSE
|
+-- IoT / embedded / constrained devices?
| +-- Binary efficiency matters? --> gRPC
| +-- HTTP-only environments? --> REST
Quick Comparison
| Concern | REST | gRPC | GraphQL |
|---|---|---|---|
| Transport | HTTP/1.1+ | HTTP/2 | HTTP (any) |
| Serialization | JSON (text) | Protobuf (binary) | JSON (text) |
| Schema | OpenAPI (optional) | .proto (required) | SDL (required) |
| Browser support | Native | Via gRPC-Web/Connect | Native |
| Caching | HTTP caching built-in | Custom | Custom (normalized) |
| Learning curve | Low | Medium | Medium-High |
| Code generation | Optional | Required | Optional but recommended |
| Streaming | SSE, WebSocket | Native (4 patterns) | Subscriptions |
| Over-fetching | Common problem | No (typed) | Solved by design |
| File uploads | Multipart native | Chunked streaming | Multipart spec (awkward) |
REST Resource Design Quick Reference
Resource Naming
GET /users # Collection
GET /users/{id} # Singleton
GET /users/{id}/orders # Sub-collection
POST /users # Create
PUT /users/{id} # Full replace
PATCH /users/{id} # Partial update
DELETE /users/{id} # Remove
# Naming rules:
# - Plural nouns for collections: /users NOT /user
# - Kebab-case for multi-word: /line-items NOT /lineItems
# - No verbs in URLs: POST /orders NOT POST /create-order
# - Max 3 levels deep: /users/{id}/orders (not /users/{id}/orders/{oid}/items/{iid}/details)
HTTP Methods and Status Codes
| Method | Success | Empty | Invalid | Not Found | Conflict |
|---|---|---|---|---|---|
| GET | 200 | 200 (empty array) | 400 | 404 | - |
| POST | 201 + Location | - | 400/422 | - | 409 |
| PUT | 200 | - | 400/422 | 404 | 409 |
| PATCH | 200 | - | 400/422 | 404 | 409 |
| DELETE | 204 | 204 (already gone) | 400 | 404 | 409 |
HATEOAS (When Worth It)
Use when: public APIs where discoverability matters, long-lived APIs, APIs that evolve frequently. Skip when: internal microservices, mobile backends, tight coupling is acceptable.
{
"id": "order-123",
"status": "shipped",
"_links": {
"self": { "href": "/orders/order-123" },
"track": { "href": "/orders/order-123/tracking" },
"cancel": { "href": "/orders/order-123", "method": "DELETE" }
}
}
Pagination Decision Tree
What's your data like?
|
+-- Stable data, UI needs "jump to page 5"?
| --> Offset pagination: ?page=5&per_page=20
| Tradeoff: Slow on large offsets (OFFSET 10000), inconsistent with inserts
|
+-- Large dataset, forward-only traversal?
| --> Cursor pagination: ?after=eyJpZCI6MTIzfQ&limit=20
| Tradeoff: No random page access, but consistent and fast
|
+-- Real-time feed, ordered by timestamp or ID?
| --> Keyset pagination: ?created_after=2024-01-01T00:00:00Z&limit=20
| Tradeoff: Requires a unique, sequential column; no page jumping
Response Envelope
{
"data": [...],
"pagination": {
"total": 1432,
"limit": 20,
"has_more": true,
"next_cursor": "eyJpZCI6MTQzMn0="
}
}
Error Response Format (RFC 7807)
All APIs should use Problem Details (RFC 7807 / RFC 9457):
{
"type": "https://api.example.com/errors/insufficient-funds",
"title": "Insufficient Funds",
"status": 422,
"detail": "Account xxxx-1234 has a balance of $10.00, but the transfer requires $25.00.",
"instance": "/transfers/txn-abc-123",
"balance": 1000,
"required": 2500
}
Field Reference
| Field | Required | Description |
|---|---|---|
type |
Yes | URI identifying the error type (stable, documentable) |
title |
Yes | Human-readable summary (same for all instances of this type) |
status |
Yes | HTTP status code |
detail |
Yes | Human-readable explanation specific to this occurrence |
instance |
No | URI identifying the specific occurrence |
| (extensions) | No | Additional machine-readable fields |
Validation Errors
{
"type": "https://api.example.com/errors/validation",
"title": "Validation Failed",
"status": 422,
"detail": "The request body contains 2 validation errors.",
"errors": [
{ "field": "email", "message": "Must be a valid email address", "code": "invalid_format" },
{ "field": "age", "message": "Must be at least 18", "code": "out_of_range", "min": 18 }
]
}
Versioning Strategies
| Strategy | Example | Pros | Cons |
|---|---|---|---|
| URL path | /v2/users |
Obvious, cacheable, easy routing | URL pollution, hard to sunset |
| Accept header | Accept: application/vnd.api.v2+json |
Clean URLs, content negotiation | Hidden, harder to test |
| Query param | /users?version=2 |
Easy to add | Pollutes query string, caching issues |
| Date-based | API-Version: 2024-01-15 |
Granular evolution (Stripe style) | Complex implementation |
Recommendation
- Public APIs: URL path versioning (
/v1/) - simplicity wins - Internal APIs: Header or no versioning (deploy in lockstep)
- Evolving APIs: Date-based (Stripe model) if you have the engineering investment
Breaking Change Rules
A breaking change is anything that can cause existing clients to fail:
- Removing a field from a response
- Renaming a field
- Changing a field's type
- Adding a required field to a request
- Changing URL structure
- Changing error formats
- Removing an endpoint
Non-breaking (safe):
- Adding optional fields to requests
- Adding fields to responses
- Adding new endpoints
- Adding new enum values (if client handles unknown values)
Rate Limiting Design
Algorithms
| Algorithm | Behavior | Use When |
|---|---|---|
| Token bucket | Allows bursts, refills at steady rate | General API rate limiting |
| Sliding window | Smooth distribution, no burst | Strict fairness needed |
| Fixed window | Simple, potential burst at boundary | Low-stakes limiting |
| Leaky bucket | Constant output rate | Queue processing |
Response Headers
X-RateLimit-Limit: 1000 # Max requests per window
X-RateLimit-Remaining: 743 # Requests left in current window
X-RateLimit-Reset: 1672531200 # Unix timestamp when window resets
Retry-After: 30 # Seconds to wait (on 429)
429 Response Body
{
"type": "https://api.example.com/errors/rate-limit-exceeded",
"title": "Rate Limit Exceeded",
"status": 429,
"detail": "You have exceeded 1000 requests per hour. Try again in 30 seconds.",
"retry_after": 30
}
Idempotency
Which Methods Need Idempotency Keys?
| Method | Idempotent by spec? | Needs key? |
|---|---|---|
| GET | Yes | No |
| PUT | Yes | No (full replacement is naturally idempotent) |
| DELETE | Yes | No |
| PATCH | No | Recommended for critical operations |
| POST | No | Yes (always for payments, orders, transfers) |
Implementation
POST /payments
Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
Content-Type: application/json
{ "amount": 2500, "currency": "usd", "customer": "cust_123" }
Server-side:
- Receive request with
Idempotency-Keyheader - Check if key exists in store (Redis, DB)
- If exists: return stored response (same status code + body)
- If not: process request, store response keyed by idempotency key
- Keys expire after 24-48 hours
Authentication Overview
| Method | Use When | Security Level |
|---|---|---|
| API Key | Server-to-server, internal, simple | Low-Medium |
| JWT (Bearer) | Stateless auth, microservices | Medium-High |
| OAuth2 + PKCE | Third-party access, user delegation | High |
| mTLS | Service mesh, zero-trust infra | Very High |
Decision Guide
Who is authenticating?
|
+-- Your own frontend? --> JWT (short-lived access + refresh token)
+-- Third-party developer? --> OAuth2 (client credentials for server, PKCE for SPA)
+-- Another internal service? --> mTLS or JWT with service accounts
+-- Quick prototype? --> API key (but plan migration)
Gotchas Table
| Gotcha | Problem | Prevention |
|---|---|---|
| Breaking changes in "non-breaking" release | Client crashes | Additive-only policy, contract tests |
| N+1 in REST APIs | 100 users = 101 queries | Compound documents, ?include=, or GraphQL |
| Over-fetching | Mobile gets 50 fields, needs 3 | Sparse fieldsets ?fields=id,name or GraphQL |
| Under-fetching | 3 requests to build one view | Composite endpoints or BFF pattern |
| CORS misconfiguration | Frontend can't reach API | Explicit allowed origins, never * with credentials |
| Missing Content-Type | 415 or silent parsing failure | Validate Content-Type on every mutation endpoint |
| Large payloads without pagination | OOM, timeouts | Always paginate collections, set max page size |
| Inconsistent date formats | Parsing hell | ISO 8601 everywhere: 2024-01-15T10:30:00Z |
| No request IDs | Impossible to debug | Generate X-Request-ID, propagate through services |
| Enum evolution | New value breaks old client | Document that enums may grow, clients must handle unknown |
| Missing idempotency | Duplicate charges, orders | Idempotency keys on all POST endpoints with side effects |
| Unbounded query complexity | GraphQL DoS | Depth limiting, cost analysis, persisted queries |
Reference Files
| File | Contents |
|---|---|
references/rest-advanced.md |
Resource modeling, PATCH strategies, caching, webhooks, bulk ops |
references/grpc.md |
Protobuf, service definitions, Go/Rust, streaming, error handling |
references/graphql.md |
Schema design, resolvers, DataLoader, federation, performance |
references/api-security.md |
JWT, OAuth2, CORS, rate limiting, OWASP API Top 10 |
Weekly Installs
2
Repository
0xdarkmatter/claude-modsGitHub Stars
9
First Seen
5 days ago
Security Audits
Installed on
amp2
cline2
openclaw2
opencode2
cursor2
kimi-cli2