openrouter-api
OpenRouter API
OpenRouter is a unified API in front of 300+ LLMs. Schema is OpenAI-compatible at /api/v1/chat/completions plus OpenRouter-only extensions (models[], provider, plugins, reasoning, cache_control, presets, :suffix shortcuts). Auth is Authorization: Bearer <OPENROUTER_API_KEY>.
Base URL & auth
POST https://openrouter.ai/api/v1/chat/completions
Authorization: Bearer $OPENROUTER_API_KEY
Content-Type: application/json
Optional attribution headers (enable leaderboard ranking):
- HTTP-Referer: <YOUR_SITE_URL>
- X-OpenRouter-Title: <YOUR_SITE_NAME> (X-Title also accepted)
- X-OpenRouter-Categories: <comma,separated>
API keys are created at https://openrouter.ai/keys. Each key supports a credit limit and works for OAuth flows. OpenRouter is a GitHub secret-scanning partner — leaked sk-or-... keys trigger email notification.
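As a minimal sketch, the required and optional headers above could be assembled like this. The helper name `build_headers`, its parameters, and the example site values are illustrative assumptions, not part of any OpenRouter SDK; only the header names come from this section.

```python
import os


def build_headers(api_key, site_url=None, site_name=None, categories=None):
    """Return OpenRouter request headers (hypothetical helper)."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Attribution headers are optional; add only the ones that were supplied.
    if site_url:
        headers["HTTP-Referer"] = site_url
    if site_name:
        headers["X-OpenRouter-Title"] = site_name
    if categories:
        headers["X-OpenRouter-Categories"] = ",".join(categories)
    return headers


# Placeholder values; the key is read from the environment when it is set.
headers = build_headers(
    os.getenv("OPENROUTER_API_KEY", "YOUR_KEY"),
    site_url="https://example.com",
    site_name="Example App",
)
```

Keeping the attribution headers optional means the same helper works for scripts that have no public site to attribute.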
Three integration paths
| Path | Package | Best for |
|---|---|---|
| Raw HTTP | none | Any language, no deps |
| Client SDK | @openrouter/sdk (npm) / openrouter (pip) | Type-safe thin wrapper over REST |
| Agent SDK | @openrouter/agent (npm only, TS) | Multi-turn loops, tool execution, state via callModel |
| OpenAI SDK | openai with baseURL: https://openrouter.ai/api/v1 | Drop-in for existing OpenAI code |
Use the OpenRouter SDKs by default. Reference the OpenAI SDK only when the user explicitly asks for it.
Quickstart (TS SDK)
import OpenRouter from '@openrouter/sdk';
const client = new OpenRouter({ apiKey: process.env.OPENROUTER_API_KEY });
const completion = await client.chat.send({
  model: 'openai/gpt-5.2',
  messages: [{ role: 'user', content: 'What is the meaning of life?' }],
});
console.log(completion.choices[0].message.content);
Quickstart (Python SDK)
from openrouter import OpenRouter
import os
with OpenRouter(api_key=os.getenv("OPENROUTER_API_KEY")) as client:
    response = client.chat.send(
        model="openai/gpt-5.2",
        messages=[{"role": "user", "content": "What is the meaning of life?"}],
    )
    print(response.choices[0].message.content)
Quickstart (raw HTTP)
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-5.2","messages":[{"role":"user","content":"hi"}]}'
Model identifiers
Format: <provider>/<model> (e.g., anthropic/claude-sonnet-4.6, openai/gpt-5.2, google/gemini-3-flash-preview, meta-llama/llama-3.3-70b-instruct, deepseek/deepseek-v3.2). Browse all: https://openrouter.ai/models. List via API: GET /api/v1/models?supported_parameters=tools&output_modalities=text.
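The slug format and the filtered listing URL can be sketched with the standard library alone. The helper names `split_slug` and `models_url` are hypothetical; the endpoint and query parameters are the ones quoted above.

```python
from urllib.parse import urlencode

MODELS_ENDPOINT = "https://openrouter.ai/api/v1/models"


def split_slug(slug):
    """Split '<provider>/<model>' into its two parts (hypothetical helper)."""
    provider, _, model = slug.partition("/")
    return provider, model


def models_url(**filters):
    """Build the model-listing URL, adding query filters when given."""
    if not filters:
        return MODELS_ENDPOINT
    return f"{MODELS_ENDPOINT}?{urlencode(filters)}"


provider, model = split_slug("anthropic/claude-sonnet-4.6")
url = models_url(supported_parameters="tools", output_modalities="text")
```

The resulting `url` can be fetched with any HTTP client using the same Authorization header as chat requests.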
Suffix shortcuts
| Suffix | Effect | Equivalent |
|---|---|---|
| :nitro | Sort providers by throughput | provider.sort = "throughput" |
| :floor | Sort providers by lowest price | provider.sort = "price" |
| :online | Enable web plugin (deprecated; prefer openrouter:web_search server tool) | plugins: [{id: "web"}] with openrouter/auto |
| :free | Free-tier variants (rate-limited, daily caps) | none |
Combinable: openai/gpt-oss-20b:free:online.
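To make the "Equivalent" column concrete, here is a small sketch that rewrites the two sorting suffixes as an explicit provider.sort field. `expand_sort_suffix` is a hypothetical client-side helper written for illustration; it only handles :nitro and :floor and leaves every other suffix on the slug.

```python
# Sorting suffixes and the provider.sort value the table lists as equivalent.
SORT_SUFFIXES = {"nitro": "throughput", "floor": "price"}


def expand_sort_suffix(body):
    """Return a copy of body with :nitro/:floor rewritten as provider.sort."""
    base, *suffixes = body["model"].split(":")
    kept = [s for s in suffixes if s not in SORT_SUFFIXES]
    sorts = [SORT_SUFFIXES[s] for s in suffixes if s in SORT_SUFFIXES]
    # Copy the body so the caller's dict is left untouched.
    expanded = dict(body, model=":".join([base, *kept]))
    if sorts:
        expanded["provider"] = {**body.get("provider", {}), "sort": sorts[0]}
    return expanded


request = {"model": "meta-llama/llama-3.3-70b-instruct:nitro", "messages": []}
explicit = expand_sort_suffix(request)
```

In practice the suffix can simply be sent as-is; the expansion is only useful when the sort preference needs to be combined with other provider settings in code.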
Special model slugs
- openrouter/auto: Auto Router (NotDiamond-powered prompt-aware model selection)
- openrouter/bodybuilder: natural-language → structured request bodies for parallel multi-model fan-out (free)
- @preset/<slug>: invoke a saved preset (system prompt + params + provider rules) by name. Configure at https://openrouter.ai/settings/presets. Also usable as a preset: "<slug>" field or as model: "openai/gpt-4@preset/<slug>" (combined). Per-request params shallow-merge over preset config.
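The three ways of invoking a preset, and what a shallow merge implies, can be sketched as follows. The slug `my-preset` and the preset contents are made up for illustration, and `shallow_merge` only demonstrates the general meaning of a shallow merge; it is not OpenRouter's server-side code.

```python
def preset_request_variants(slug, model):
    """Return the three request shapes described for invoking one preset."""
    return [
        {"model": f"@preset/{slug}"},            # preset as the model slug
        {"model": model, "preset": slug},        # preset as a separate field
        {"model": f"{model}@preset/{slug}"},     # combined form
    ]


def shallow_merge(preset_config, request_params):
    """Top-level request keys replace preset keys; nested dicts are not merged."""
    return {**preset_config, **request_params}


variants = preset_request_variants("my-preset", "openai/gpt-4")

# Hypothetical preset contents, for illustration only.
preset_config = {"temperature": 0.2, "provider": {"sort": "price", "only": ["openai"]}}
merged = shallow_merge(preset_config, {"provider": {"sort": "throughput"}})
```

Under a shallow merge, overriding `provider` in the request replaces the whole object, so the preset's `only` entry in this sketch does not survive. When part of a nested preset setting must be kept, it is safer to restate the full object in the request.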
Request body — top-level shape
type Request = {
  // Either messages or prompt
  messages?: Message[];
  prompt?: string;
  model?: string; // omit = user default
  models?: string[]; // model fallbacks in order
  stream?: boolean;
  stop?: string | string[];
  response_format?: ResponseFormat; // json_object | json_schema
  tools?: Tool[];
  tool_choice?: ToolChoice;
  parallel_tool_calls?: boolean; // default true
  // OpenRouter-only
  provider?: ProviderPreferences; // see references/routing.md
  plugins?: Plugin[]; // web, file-parser, response-healing, context-compression, auto-router
  reasoning?: ReasoningConfig; // see references/reasoning.md
  preset?: string;
  user?: string; // stable end-user ID for abuse detection
  debug?: { echo_upstream_body?: boolean }; // streaming only
  // Standard sampling — see references/api-reference.md
  max_tokens?: number;
  temperature?: number;
  top_p?: number; top_k?: number; min_p?: number; top_a?: number;
  frequency_penalty?: number; presence_penalty?: number; repetition_penalty?: number;
  seed?: number; logit_bias?: Record<number, number>;
  logprobs?: boolean; top_logprobs?: number;
  prediction?: { type: 'content'; content: string };
  verbosity?: 'low' | 'medium' | 'high' | 'xhigh' | 'max';
};
Unsupported parameters for the chosen model are silently ignored; the rest forward to the upstream. To require providers that support all parameters, set provider.require_parameters: true.
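A request body that combines ordered fallbacks with strict parameter support might be assembled as below. `build_chat_request` is a hypothetical helper, and sending `models` without `model` is an assumption based on both fields being optional in the type above.

```python
import json


def build_chat_request(messages, models, require_parameters=False, **params):
    """Assemble a chat-completions body with ordered model fallbacks."""
    if not models:
        raise ValueError("at least one model slug is required")
    body = {"messages": messages, "models": list(models), **params}
    if require_parameters:
        # Only route to providers that support every parameter in the body,
        # instead of letting unsupported ones be silently ignored.
        body["provider"] = {**body.get("provider", {}), "require_parameters": True}
    return body


body = build_chat_request(
    [{"role": "user", "content": "hi"}],
    ["openai/gpt-5.2", "anthropic/claude-sonnet-4.6"],
    require_parameters=True,
    temperature=0.2,
    top_k=40,
)
payload = json.dumps(body)  # string to send as the POST body
```

Setting require_parameters trades availability for predictability: a parameter such as top_k is either honored or the request is not routed to that provider.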
Response shape
{
  "id": "gen-xxxxxxxxxxxxxx",
  "model": "openai/gpt-4o",
  "object": "chat.completion",
  "choices": [{
    "finish_reason": "stop",
    "native_finish_reason": "stop",
    "message": { "role": "assistant", "content": "Hello there!" }
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 4,
    "total_tokens": 14,
    "prompt_tokens_details": { "cached_tokens": 0, "cache_write_tokens": 0 },
    "completion_tokens_details": { "reasoning_tokens": 0 },
    "cost": 0.00014
  }
}
Streaming chunks have delta instead of message and object: "chat.completion.chunk". The final chunk before data: [DONE] carries usage with empty choices.
finish_reason is normalized to one of: tool_calls, stop, length, content_filter, error. Raw provider value is in native_finish_reason. The generation ID is also returned in the X-Generation-Id response header.
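The streaming shape described above can be consumed with a small parser. This is a sketch that works on already-decoded SSE lines rather than a live connection; `collect_stream` is a hypothetical helper and the sample lines are hand-written to match the documented chunk layout.

```python
import json


def collect_stream(lines):
    """Join delta content from SSE lines and return (text, usage)."""
    parts, usage = [], None
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
        # The final chunk carries usage alongside an empty choices list.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage


# Hand-written sample lines, not captured API output.
sample = [
    'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}',
    "",
    ": keep-alive comment",
    'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":" there!"}}]}',
    'data: {"object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":10,"completion_tokens":4,"total_tokens":14}}',
    "data: [DONE]",
]
text, usage = collect_stream(sample)
```

Reading usage from whichever chunk carries it, rather than assuming a fixed position, keeps the parser tolerant of providers that order chunks slightly differently.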
Errors
{ "error": { "code": 400, "message": "...", "metadata": { ... } } }
| Code | Meaning |
|---|---|
| 400 | Bad request / invalid params / CORS |
| 401 | Invalid API key / expired OAuth |
| 402 | Insufficient credits — add credits |
| 403 | Moderation flagged input |
| 408 | Request timeout |
| 429 | Rate limited |
| 502 | Upstream model down / invalid response |
| 503 | No provider meets routing requirements |
Mid-stream errors (HTTP 200 already sent) arrive as a final SSE chunk with a top-level error and a choice with finish_reason: "error". See references/streaming.md.
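A client typically needs two checks: whether an error response is worth retrying, and whether a streamed chunk signals a mid-stream failure. The sketch below shows both. The set of retryable codes is a client-side policy assumption rather than something the table prescribes, and the helper names and sample payloads are hypothetical.

```python
# Assumption: timeouts, rate limits, and upstream/routing failures are
# treated as transient; the other listed codes need a changed request.
RETRYABLE_CODES = {408, 429, 502, 503}


def is_retryable(error_body):
    """Return True when the error code is in the assumed retryable set."""
    return error_body.get("error", {}).get("code") in RETRYABLE_CODES


def midstream_error(chunk):
    """Return the error object if a streamed chunk signals failure, else None."""
    error = chunk.get("error")
    failed = any(c.get("finish_reason") == "error" for c in chunk.get("choices", []))
    if error is not None:
        return error
    # finish_reason "error" without an error object still counts as a failure.
    return {"message": "stream ended with finish_reason error"} if failed else None


# Hand-written sample payloads for illustration.
rate_limited = {"error": {"code": 429, "message": "Rate limited"}}
bad_chunk = {
    "error": {"code": 502, "message": "Provider disconnected"},
    "choices": [{"delta": {"content": ""}, "finish_reason": "error"}],
}
ok_chunk = {"choices": [{"delta": {"content": "Hi"}, "finish_reason": None}]}
```

Because the HTTP status is already 200 by the time a mid-stream error appears, the chunk-level check is the only place a streaming client can notice it.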
Feature map
| Feature | Trigger | Reference |
|---|---|---|
| Sampling params, full request/response schema, debug option | any chat call | references/api-reference.md |
| SSE streaming, cancellation, mid-stream errors | stream: true | references/streaming.md |
| Function/tool calling, agentic loop, parallel tools, streaming tool calls, MCP-to-OpenAI conversion | tools: [...] | references/tool-calling.md |
| Structured outputs (json_schema, json_object) | response_format | references/structured-outputs.md |
| Reasoning/thinking tokens (effort, max_tokens, exclude) | reasoning: {...} | references/reasoning.md |
| Prompt caching (auto, cache_control, sticky routing) | cache_control or supported provider | references/prompt-caching.md |
| Provider routing (sort, order, only, ignore, max_price, throughput, ZDR) | provider: {...} | references/routing.md |
| Model fallbacks, Auto Router, Body Builder, presets | models[], openrouter/auto, @preset/... | references/routing.md |
| Multimodal: images, PDFs, audio, video, image-gen, TTS | image_url/file/input_audio content parts | references/multimodal.md |
| Plugins: web, file-parser, response-healing, context-compression, auto-router | plugins: [...] | references/plugins.md |
| openrouter:web_search server tool, :online shortcut, citations annotations | tools: [{type:'openrouter:web_search'}] | references/web-search.md |
| Usage accounting, costs, /generation, key/credits, rate limits | usage field, GET /api/v1/key | references/usage-and-limits.md |
| BYOK provider keys, OAuth PKCE, app attribution | cbat_-style provider keys configured at /settings/integrations | references/byok-and-oauth.md |
Always-fresh docs
OpenRouter exposes Markdown of any docs page by appending .md to the URL, plus aggregated llms-full.txt files per section. When reference material here is insufficient for a specific edge case, fetch the latest from:
- Quickstart: https://openrouter.ai/docs/quickstart/llms-full.txt
- API reference: https://openrouter.ai/docs/api/reference/llms-full.txt
- Features: https://openrouter.ai/docs/guides/features/llms-full.txt
- Routing: https://openrouter.ai/docs/guides/routing/llms-full.txt
- Plugins: https://openrouter.ai/docs/guides/features/plugins/llms-full.txt
- Multimodal: https://openrouter.ai/docs/guides/overview/multimodal/llms-full.txt
- Best practices (caching, reasoning): https://openrouter.ai/docs/guides/best-practices/llms-full.txt
- Auth (BYOK, OAuth): https://openrouter.ai/docs/guides/overview/auth/llms-full.txt
- Administration (usage accounting): https://openrouter.ai/docs/guides/administration/llms-full.txt
- Server tools (web search): https://openrouter.ai/docs/guides/features/server-tools/llms-full.txt
- OpenAPI spec: https://openrouter.ai/openapi.yaml or .json
Append .md to any docs URL for clean Markdown of that single page.
Common pitfalls
- prompt and messages are mutually exclusive; pick one.
- The tools array must be repeated on every turn of an agentic loop. The router validates the schema each call. Forgetting to append the assistant's tool-call message to history before the tool-result message breaks Anthropic models.
- Anthropic automatic caching (top-level cache_control) only routes to the Anthropic provider; Bedrock and Vertex AI are excluded. Per-block cache_control works on all three.
- max_tokens cap is context_length - prompt_length, not unbounded.
- OpenAI o-series and some reasoning models do not return reasoning tokens even when present internally. reasoning.exclude: true controls visibility, not internal use.
- debug.echo_upstream_body only works with stream: true and is dev-only; it can leak request internals.
- The :online shortcut is the deprecated web plugin routed through openrouter/auto. New code uses the openrouter:web_search server tool so the model decides when to search.
- Mid-stream errors keep HTTP 200 because headers were already sent. Detect via the top-level error field on a chunk plus finish_reason: "error".
- BYOK keys are always tried first, before OpenRouter shared capacity, regardless of provider.order.
- Free-tier (:free) rate limits are global per account; extra API keys do not multiply quota.
- The usage: { include: true } and stream_options: { include_usage: true } parameters are deprecated; usage is always included now.
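The tool-loop pitfall and the max_tokens cap can be sketched together. The message shapes below assume the OpenAI-compatible tool-calling convention (an assistant message with tool_calls, followed by tool-role messages keyed by tool_call_id). The helper names, the get_weather tool, and the sample values are all hypothetical.

```python
def next_turn_request(model, messages, tools, assistant_message, tool_outputs):
    """Build the follow-up request after the model asked for tool calls.

    tool_outputs maps each tool_call id to the string result of running it.
    """
    history = list(messages)
    # 1. The assistant message carrying tool_calls goes into history first.
    history.append(assistant_message)
    # 2. Then one tool-role message per call, linked by tool_call_id.
    for call in assistant_message.get("tool_calls", []):
        history.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": tool_outputs[call["id"]],
        })
    # 3. The same tools array is sent again on this turn.
    return {"model": model, "messages": history, "tools": tools}


def max_completion_budget(context_length, prompt_length):
    """Upper bound for max_tokens: context_length - prompt_length, floored at 0."""
    return max(context_length - prompt_length, 0)


tools = [{"type": "function", "function": {
    "name": "get_weather",  # hypothetical tool
    "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
}}]
messages = [{"role": "user", "content": "Weather in Paris?"}]
assistant_message = {"role": "assistant", "content": None, "tool_calls": [{
    "id": "call_1", "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
}]}
follow_up = next_turn_request(
    "openai/gpt-5.2", messages, tools, assistant_message, {"call_1": "18C, cloudy"}
)
```

Building the history in this fixed order (assistant tool-call message, then tool results) is what avoids the Anthropic failure mode described in the list above.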