venice-api-overview
Venice API Overview
Venice.ai is an OpenAI-compatible inference platform for text, image, audio, video, and embeddings. One API — two ways to pay: a traditional API key (Pro account), or a wallet (x402, USDC on Base, no account required).
Use when
- You're writing code against
api.venice.aifor the first time. - You need to decide between API-key and x402/wallet authentication.
- You want a quick map of which endpoint to call for which task.
- You need to understand the common response headers (
X-Balance-Remaining,PAYMENT-REQUIRED, etc.).
Base URL
All endpoints live under:
https://api.venice.ai/api/v1
The OpenAPI spec is distributed at outerface/swagger.yaml (current version 20260420.235001).
Authentication (pick one per request)
| Scheme | Header | Best for |
|---|---|---|
BearerAuth |
Authorization: Bearer <VENICE_API_KEY> |
Server-side apps, dashboards, usage analytics, bundled credits |
siwx (x402) |
X-Sign-In-With-X: <base64 SIWE JSON> |
No account, pay-as-you-go with USDC on Base, serverless / agents |
Every inference endpoint accepts either — see venice-auth.
# Bearer
curl https://api.venice.ai/api/v1/models \
-H "Authorization: Bearer $VENICE_API_KEY"
# x402 / SIWE (one-liner via the SDK)
import { VeniceClient } from 'venice-x402-client'
const v = new VeniceClient(process.env.WALLET_KEY)
await v.models.list()
Endpoint map
Inference
| Category | Endpoints | Skill |
|---|---|---|
| Chat | POST /chat/completions |
venice-chat |
| Responses (Alpha) | POST /responses |
venice-responses |
| Embeddings | POST /embeddings |
venice-embeddings |
| Image gen | POST /image/generate, POST /images/generations, GET /image/styles |
venice-image-generate |
| Image edit | POST /image/edit, POST /image/multi-edit, POST /image/upscale, POST /image/background-remove |
venice-image-edit |
| TTS | POST /audio/speech |
venice-audio-speech |
| STT | POST /audio/transcriptions |
venice-audio-transcription |
| Music (async) | POST /audio/quote, /audio/queue, /audio/retrieve, /audio/complete |
venice-audio-music |
| Video (async) | POST /video/quote, /video/queue, /video/retrieve, /video/complete, /video/transcriptions |
venice-video |
Catalog
| Category | Endpoints | Skill |
|---|---|---|
| Models | GET /models, /models/traits, /models/compatibility_mapping |
venice-models |
| Characters | GET /characters, /characters/{slug}, /characters/{slug}/reviews |
venice-characters |
Account, billing, wallet
| Category | Endpoints | Skill |
|---|---|---|
| API keys | `GET | POST |
| Billing | GET /billing/balance, /billing/usage, /billing/usage-analytics |
venice-billing |
| x402 wallet | GET /x402/balance/{wallet}, POST /x402/top-up, GET /x402/transactions/{wallet} |
venice-x402 |
Utility
| Category | Endpoints | Skill |
|---|---|---|
| Crypto RPC proxy | GET /crypto/rpc/networks, POST /crypto/rpc/{network} |
venice-crypto-rpc |
| Augment | POST /augment/text-parser, /augment/scrape, /augment/search |
venice-augment |
Response headers to watch
| Header | When | Meaning |
|---|---|---|
X-Balance-Remaining |
x402 inference success | USDC credits left, e.g. "4.230000" |
X-RateLimit-Limit-* / X-RateLimit-Remaining-* |
all inference | your current per-minute/day caps |
PAYMENT-REQUIRED |
402 on x402 inference |
base64 JSON with top-up + SIWX challenge (x402 v2) |
Content-Encoding |
200 when client sent Accept-Encoding: gzip, br |
compression (embeddings, chat) |
Pricing model at a glance
- Pricing is dynamic per request, metered in USD.
- Paid inference endpoints in the spec carry an
x-payment-infoblock withminandmaxbounds in USD (typicallymin: 0.001,max: 10.00; higher for bulk video/audio). Read-only discovery routes likeGET /models,/models/traits, and/models/compatibility_mappingdo not. - Pro (Bearer) accounts draw from DIEM (staked credits), USD balance, and bundled credits in priority order.
- x402 (wallet) users draw from a prepaid USDC credit balance on Base.
- The authoritative per-model price is on
GET /models→model_spec.pricing(when present — video models omit it; use/video/quotefor video pricing) (seevenice-models).
Standard error shape
Every error body follows one of:
{ "error": "Human-readable message" }
or, for 400 validation errors:
{ "error": "...", "details": { "fieldName": { "_errors": ["Field is required"] } } }
402 on x402 adds structured topUpInstructions and siwxChallenge. See venice-errors for the full table and retry strategy.
OpenAI compatibility — what works and what doesn't
- Drop-in:
/chat/completions,/embeddings,/images/generations,/audio/speech,/audio/transcriptions,/models. - Ignored but accepted for compat:
user,store. - Venice-only extensions live under:
venice_parameters(chat completions)venice_parametersis rejected on/responses— use headers / native fields instead
- Model feature suffixes (e.g.
zai-org-glm-5-1:enable_web_search=on,kimi-k2-6:strip_thinking_response=true&disable_thinking=true) flipvenice_parametersvia the model ID — seevenice-chat.
Versioning
info.versioninswagger.yamlis a timestamp (YYYYMMDD.HHMMSS). There is no/v2; features roll forward on the single/api/v1surface and are guarded by:- Alpha/Beta tags in endpoint descriptions (e.g.
/responses, Billing). x-guidance/ model capability flags on/models.
- Alpha/Beta tags in endpoint descriptions (e.g.
- Always check the model's
model_spec.capabilitiesfromGET /modelsfor feature flags (supportsWebSearch,supportsReasoning,supportsE2EE,supportsXSearch,supportsMultipleImages,supportsFunctionCalling,supportsAudioInput,supportsVideoInput, …) before relying on a feature.
Fast start checklist
- Read
venice-authand choose Bearer vs x402. GET /models— pick a model and note itsmodel_spec.constraintsandmodel_spec.pricing.- Wire up one happy-path call from the matching skill.
- Add error handling using
venice-errors(402, 422, 429). - Hook up observability via
X-Balance-Remaining//billing/usage//x402/transactions.
More from veniceai/skills
venice-audio-transcription
Transcribe audio files to text via POST /audio/transcriptions. Covers supported models (Parakeet, Whisper, Wizper, Scribe, xAI STT), supported formats (wav/flac/m4a/aac/mp4/mp3/ogg/webm), response formats (json/text), timestamps, and language hints. OpenAI-compatible multipart.
29venice-video
Generate and transcribe videos via Venice. Covers the async /video/quote + /video/queue + /video/retrieve + /video/complete loop, text-to-video, image-to-video, video-to-video (upscale), audio input, reference images, scene and element support, plus /video/transcriptions for YouTube URLs.
28venice-audio-speech
Generate speech from text via POST /audio/speech. Covers TTS models (Kokoro, Qwen 3, xAI, Inworld, Chatterbox, Orpheus, ElevenLabs Turbo, MiniMax, Gemini Flash), voices per family, output formats (mp3/opus/aac/flac/wav/pcm), streaming, prompt/emotion styling, temperature/top_p, and language hints.
28venice-image-generate
Generate images with Venice. Covers POST /image/generate (Venice-native), POST /images/generations (OpenAI-compatible), GET /image/styles (style presets), request fields (prompt, dimensions, cfg_scale, seed, variants, style_preset, aspect_ratio, resolution, safe_mode, watermark), and response formats.
28venice-embeddings
Call POST /embeddings on Venice. Covers request shape (input, model, encoding_format, dimensions, user), OpenAI compatibility, response compression (gzip/br), and practical usage for retrieval, clustering, and RAG.
28venice-errors
Handle Venice API errors correctly. Covers the StandardError / DetailedError / ContentViolationError / X402InferencePaymentRequired body shapes, every meaningful status code (400, 401, 402, 403, 415, 422, 429, 500, 503, 504), the 402 PAYMENT-REQUIRED header used by x402 inference, 422 content-policy suggested_prompt retry pattern, 429 rate-limit headers, and an exponential-backoff retry strategy with idempotency.
27