venice-errors
Venice errors & retries
Every Venice endpoint returns one of four error shapes. Knowing which shape you got tells you how to react.
Error body shapes
1. StandardError — simple message
The default shape for 4xx/5xx. Emitted when there's nothing structured to surface.
{ "error": "Unauthorized" }
2. DetailedError — Zod validation failure
Used for some 400 responses on malformed request bodies. When present, details is a Zod format() tree (_errors recursively keyed by field) alongside a flat issues array. Many 400s are plain StandardError without details — always handle both.
{
"error": "Invalid request",
"details": {
"_errors": [],
"messages": { "_errors": ["Field is required"] }
},
"issues": [
{ "code": "invalid_type", "path": ["messages"], "message": "Field is required" }
]
}
Render details / issues to the user so they can fix the input; don't retry — the request shape is wrong.
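Because details is nested, rendering it usually means flattening it first. The following is a minimal sketch, assuming only the tree shape shown above (an _errors array at every level plus one child per field); the function name flattenZodDetails is illustrative and not part of any Venice SDK.

```typescript
// Recursively collect messages from a Zod format() tree,
// keyed by dotted field path ("" would be the root).
function flattenZodDetails(
  node: unknown,
  path: string[] = [],
  out: Record<string, string[]> = {},
): Record<string, string[]> {
  if (typeof node !== "object" || node === null) return out;
  for (const [key, value] of Object.entries(node as Record<string, unknown>)) {
    if (key === "_errors") {
      // Only record levels that actually carry messages.
      if (Array.isArray(value) && value.length > 0) {
        out[path.join(".")] = value.map(String);
      }
    } else {
      flattenZodDetails(value, [...path, key], out);
    }
  }
  return out;
}

// The details object from the sample body above.
const sampleDetails = {
  _errors: [],
  messages: { _errors: ["Field is required"] },
};
console.log(flattenZodDetails(sampleDetails));
// logs { messages: [ 'Field is required' ] }
```

If you only need a flat list, the issues array already provides one; the tree is mainly useful when you want to attach messages to nested form fields.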
3. ContentViolationError — 422 content policy
Returned when a prompt trips content policy. suggested_prompt (a model-provided safe alternative) is currently emitted by the audio generation pipeline (/audio/queue, /audio/retrieve); image and video endpoints return { error: "Content policy violation" } without suggested_prompt.
{
"error": "Content policy violation",
"suggested_prompt": "A cinematic instrumental track inspired by stormy weather and dramatic tension."
}
Pattern — when suggested_prompt is present, retry once with prompt = suggested_prompt if the user consents.
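The consent-gated single retry can be expressed as a small decision function. This is a sketch under assumptions: pickRetryPrompt and the userConsents callback are hypothetical names, and how you actually ask the user is up to your UI.

```typescript
type ContentViolationBody = { error?: string; suggested_prompt?: string };

// Returns the prompt to retry with, or null when no retry should happen.
// userConsents is a hypothetical callback (e.g. a confirm dialog).
function pickRetryPrompt(
  status: number,
  body: ContentViolationBody,
  alreadyRetried: boolean,
  userConsents: (suggested: string) => boolean,
): string | null {
  // Only 422 qualifies, and only for a single retry.
  if (status !== 422 || alreadyRetried) return null;
  const suggested = body.suggested_prompt;
  // Image and video endpoints omit suggested_prompt, so there is nothing to retry with.
  if (typeof suggested !== "string" || suggested.trim() === "") return null;
  return userConsents(suggested) ? suggested : null;
}
```

A caller would resubmit the original request with prompt set to the returned string, pass alreadyRetried = true on the next failure, and surface the error to the user when the function returns null.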
4. X402InferencePaymentRequired — 402 on x402 inference calls
Returned only when the caller authenticated with SIWE and has insufficient credit. Discriminated by code: "PAYMENT_REQUIRED".
{
"error": "Payment required",
"code": "PAYMENT_REQUIRED",
"message": "Insufficient x402 balance",
"suggestedTopUpUsd": 10,
"minimumTopUpUsd": 5,
"supportedTokens": ["USDC"],
"supportedChains": ["base"],
"topUpInstructions": {
"step1": "POST /api/v1/x402/top-up with no payment header to get payment requirements",
"step2": "Sign a USDC transfer authorization using the x402 SDK (createPaymentHeader)",
"step3": "POST /api/v1/x402/top-up with the signed X-402-Payment header",
"receiverWallet": "<RECEIVER_WALLET_ADDRESS>",
"tokenAddress": "<USDC_TOKEN_ADDRESS>",
"tokenDecimals": 6,
"network": "eip155:8453",
"minimumAmountUsd": 5
},
"siwxChallenge": { ... SIWE template ... }
}
The PAYMENT-REQUIRED response header carries a base64-encoded x402 v2 paymentRequired object (x402Version, error, resource, accepts[], optional extensions) — it is not the same JSON as the body. Protocol-level clients parse the header; human-facing clients parse the richer body. See venice-x402.
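The four shapes above can be told apart from the status code and a few discriminating fields. The sketch below is one way to do it, not an official client: the type names mirror the headings in this section, classifyVeniceError is a hypothetical helper, and matching on the literal "Content policy violation" string is an assumption based on the sample bodies shown here.

```typescript
type StandardError = { error: string };
type DetailedError = StandardError & { details: unknown; issues?: unknown[] };
type ContentViolationError = StandardError & { suggested_prompt?: string };
type X402InferencePaymentRequired = StandardError & {
  code: "PAYMENT_REQUIRED";
  message?: string;
  suggestedTopUpUsd?: number;
  minimumTopUpUsd?: number;
};

type ClassifiedError =
  | { kind: "x402_payment_required"; body: X402InferencePaymentRequired }
  | { kind: "detailed"; body: DetailedError }
  | { kind: "content_violation"; body: ContentViolationError }
  | { kind: "standard"; body: StandardError }
  | { kind: "unknown"; body: unknown };

function classifyVeniceError(status: number, body: unknown): ClassifiedError {
  if (typeof body !== "object" || body === null) return { kind: "unknown", body };
  const b = body as Record<string, unknown>;
  // x402 bodies are discriminated by code: "PAYMENT_REQUIRED".
  if (status === 402 && b["code"] === "PAYMENT_REQUIRED") {
    return { kind: "x402_payment_required", body: body as X402InferencePaymentRequired };
  }
  if (typeof b["error"] !== "string") return { kind: "unknown", body };
  // Only some 400s carry Zod details; the rest fall through to "standard".
  if (status === 400 && typeof b["details"] === "object" && b["details"] !== null) {
    return { kind: "detailed", body: body as DetailedError };
  }
  // Assumption: content-policy 422s either carry suggested_prompt or use this exact message.
  if (
    status === 422 &&
    (typeof b["suggested_prompt"] === "string" || b["error"] === "Content policy violation")
  ) {
    return { kind: "content_violation", body: body as ContentViolationError };
  }
  return { kind: "standard", body: body as StandardError };
}
```

Returning a tagged union lets callers switch on kind and get a typed body in each branch, which keeps the handling code for each shape separate.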
Status code map
| Status | Body | Meaning | What to do |
|---|---|---|---|
| 400 Bad Request | DetailedError or plain StandardError | Malformed input. When details is present, it identifies the field. | Fix and re-send. Don't retry. |
| 401 Unauthorized | StandardError | Missing / invalid Bearer API key or SIWE. | Fix or rotate credentials. Don't retry. |
| 402 Payment Required | Bearer: StandardError with the configured message (e.g. { "error": "Insufficient balance" }; the handler's default path does not attach a code field). SIWE: X402InferencePaymentRequired + PAYMENT-REQUIRED header. | Out of DIEM/USD/wallet credit. | Bearer: top up at venice.ai. SIWE: run the x402 top-up flow. |
| 403 Forbidden | StandardError | Valid auth but not entitled. Typical: trial-limited endpoint, beta model, API-key consumption cap hit, SIWE signer ≠ path wallet. | Don't retry. Investigate entitlements. |
| 415 Unsupported Media Type | StandardError | Wrong Content-Type (e.g. JSON sent to a multipart endpoint, or vice versa). | Fix headers. Don't retry. |
| 422 Unprocessable Entity | ContentViolationError on image/audio/video generation; plain { error } on other routes (e.g. ASR validation errors). | Content policy violation on generation paths; schema-ish validation on others. | On audio generation, optionally retry once with suggested_prompt. On others, fix input. |
| 429 Too Many Requests | StandardError | Rate limit cap tripped. Also returned by /crypto/rpc/{network} when the credit-per-day or concurrency cap trips. | Honor X-RateLimit-* headers, back off with jitter. |
| 500 Internal Server Error | StandardError | Unexpected failure. | Retry with exponential backoff + idempotency key where supported. |
| 503 Service Unavailable | StandardError | Upstream model / service temporarily down. | Retry with backoff. Consider a fallback model. |
| 504 Gateway Timeout | StandardError | Upstream slow. Mostly on /chat/completions with huge contexts. | Switch to stream: true or shorter prompts. |
Rate-limit headers (429)
Emitted on /crypto/rpc/{network}:
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Per-minute request cap for your tier (paid = 100, staff = 1000 on crypto RPC). |
| X-RateLimit-Remaining | Requests remaining in the current 60-second window. |
| X-RateLimit-Reset | Unix timestamp in seconds when the window resets. |
Additionally, LlmInferenceError model-overloaded conditions set a Retry-After header (seconds) on the 429 — honor it when present.
Inference endpoints (chat, image, audio, video) use a per-API-key tier defined via /api_keys/rate_limits. See venice-api-keys to pre-fetch your caps, and venice-billing for DIEM/USD usage.
Response headers on 402 (x402)
| Header | Notes |
|---|---|
| PAYMENT-REQUIRED | Base64-encoded JSON of the x402 v2 paymentRequired object (x402Version, error, resource, accepts[], optional extensions['sign-in-with-x']). Protocol-level discovery: parse it even if you don't parse the JSON body. |
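Decoding the header takes a base64 decode followed by a JSON parse. The sketch below assumes the standard base64 alphabet (not base64url) and types only the field names listed above; decodePaymentRequired and HeaderBag are illustrative names, and HeaderBag is just the subset of the Fetch Headers interface the function needs.

```typescript
// Minimal header accessor; a Fetch API Headers object satisfies this.
type HeaderBag = { get(name: string): string | null };

// Only the fields named in this document are typed; their inner shapes are left open.
type PaymentRequired = {
  x402Version?: number;
  error?: string;
  resource?: unknown;
  accepts?: unknown[];
  extensions?: Record<string, unknown>;
};

function decodePaymentRequired(headers: HeaderBag): PaymentRequired | null {
  const raw = headers.get("PAYMENT-REQUIRED");
  if (!raw) return null;
  try {
    // atob yields one char per byte; re-decode as UTF-8 so non-ASCII text survives.
    const bytes = Uint8Array.from(atob(raw), (ch) => ch.charCodeAt(0));
    const parsed: unknown = JSON.parse(new TextDecoder().decode(bytes));
    return typeof parsed === "object" && parsed !== null
      ? (parsed as PaymentRequired)
      : null;
  } catch {
    // Not valid base64 or not valid JSON: treat the header as absent.
    return null;
  }
}
```

With a fetch Response, this would be called as decodePaymentRequired(res.headers) on any 402 before falling back to the JSON body.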
Retry strategy
Never retry
- 400: bad input. Fix the request.
- 401: bad auth. Fix credentials.
- 403: not entitled. Don't hammer.
- 415: wrong Content-Type.
Retry with modification
- 402 (x402): run top-up, then retry.
- 402 (Bearer): surface to user; top up at venice.ai.
- 422 with suggested_prompt: one retry with the safer prompt.
Retry with backoff
- 429: back off for at least X-RateLimit-Reset - now(). Add jitter.
- 500 / 503 / 504: exponential backoff (e.g. 0.5s, 1s, 2s, 4s, 8s), capped at ~30s. 3–5 retries max.
- Use Idempotency-Key (e.g. on /crypto/rpc/{network}) so retries can't double-bill state-mutating calls.
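The three buckets above reduce to a lookup plus a capped backoff formula. This is a sketch, not Venice's own logic: retryPolicy and backoffMs are hypothetical names, the 250 ms jitter is an arbitrary choice, and treating unlisted status codes as "never" is an assumption made here to stay conservative.

```typescript
type RetryPolicy = "never" | "modify" | "backoff";

function retryPolicy(status: number, hasSuggestedPrompt = false): RetryPolicy {
  if ([400, 401, 403, 415].includes(status)) return "never";
  if (status === 402) return "modify"; // top up first, then retry
  if (status === 422) return hasSuggestedPrompt ? "modify" : "never";
  if ([429, 500, 503, 504].includes(status)) return "backoff";
  return "never"; // assumption: unlisted statuses are not retried
}

// Exponential backoff: baseMs * 2^attempt, capped at capMs, plus random jitter.
// The random source is injectable so the schedule can be checked deterministically.
function backoffMs(
  attempt: number,
  baseMs = 500,
  capMs = 30000,
  jitterMs = 250,
  random: () => number = Math.random,
): number {
  return Math.min(baseMs * 2 ** attempt, capMs) + random() * jitterMs;
}

// With jitter disabled, attempts 0..4 give 500, 1000, 2000, 4000, 8000 ms.
const schedule = [0, 1, 2, 3, 4].map((n) => backoffMs(n, 500, 30000, 0));
console.log(schedule);
```

For 429 specifically, prefer the server's Retry-After or X-RateLimit-Reset value over the computed delay when either is present.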
Reference retry loop
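// Hypothetical helper supplied by the caller: runs the x402 top-up flow (see venice-x402).
declare function topUpX402(amountUsd: number): Promise<void>

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))
const MAX_DELAY_MS = 30000 // cap backoff at ~30s

async function callVenice<T>(fn: () => Promise<Response>): Promise<T> {
  const maxRetries = 5
  let delay = 500
  let toppedUp = false
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fn()
    if (res.ok) return res.json() as Promise<T>
    // Error bodies are normally JSON; fall back to {} if parsing fails.
    const body = await res.clone().json().catch(() => ({}))
    const { status } = res
    // Never retry: bad input, bad auth, not entitled, wrong Content-Type.
    if ([400, 401, 403, 415].includes(status)) {
      throw Object.assign(new Error(body.error ?? 'Venice error'), { status, body })
    }
    // x402: top up once, then retry. A second 402 falls through to the final throw.
    if (status === 402 && body.code === 'PAYMENT_REQUIRED' && !toppedUp) {
      await topUpX402(body.suggestedTopUpUsd)
      toppedUp = true
      continue
    }
    // Content policy: the caller decides whether to retry with body.suggested_prompt.
    if (status === 422) {
      throw Object.assign(new Error('Content policy'), { status, body })
    }
    if (status === 429 && attempt < maxRetries) {
      // Prefer Retry-After (seconds), then X-RateLimit-Reset (Unix seconds), then local backoff.
      const retryAfterSec = Number(res.headers.get('retry-after'))
      const resetSec = Number(res.headers.get('x-ratelimit-reset'))
      const waitMs = !Number.isNaN(retryAfterSec) && retryAfterSec > 0
        ? retryAfterSec * 1000
        : !Number.isNaN(resetSec) && resetSec > 0
          ? Math.max(resetSec * 1000 - Date.now(), delay)
          : delay
      await sleep(waitMs + Math.random() * 250)
      delay = Math.min(delay * 2, MAX_DELAY_MS)
      continue
    }
    if (status >= 500 && attempt < maxRetries) {
      await sleep(delay + Math.random() * 250)
      delay = Math.min(delay * 2, MAX_DELAY_MS)
      continue
    }
    throw Object.assign(new Error(body.error ?? 'Venice error'), { status, body })
  }
  throw new Error('Exceeded max retries')
}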
Streaming errors
Streaming responses (stream: true on chat, TTS, video-queue progress) deliver mid-stream errors as SSE events:
data: {"error": {"type": "…", "message": "…"}}
Treat them as terminal: the underlying connection is closed after the error event. The HTTP status is still 200 because the status line was already sent when the stream started and can't be changed mid-flight.
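A client therefore has to inspect each SSE line rather than rely on the status code. The sketch below classifies a single line under the assumption that payloads follow the data: {...} format shown above; parseSseLine and the SseEvent type are hypothetical names, and the "example_error" type value in the usage line is made up for illustration.

```typescript
type SseEvent =
  | { kind: "error"; type: string; message: string }
  | { kind: "data"; payload: unknown }
  | { kind: "skip" };

function parseSseLine(line: string): SseEvent {
  const trimmed = line.trim();
  // Blank lines, comments and non-data fields carry no payload.
  if (!trimmed.startsWith("data:")) return { kind: "skip" };
  const data = trimmed.slice("data:".length).trim();
  // Empty data and the [DONE] marker are not errors.
  if (data === "" || data === "[DONE]") return { kind: "skip" };
  let payload: unknown;
  try {
    payload = JSON.parse(data);
  } catch {
    // Non-JSON data is passed through as a raw string.
    return { kind: "data", payload: data };
  }
  if (typeof payload === "object" && payload !== null && "error" in payload) {
    const err = (payload as { error: unknown }).error;
    if (typeof err === "object" && err !== null) {
      const e = err as { type?: unknown; message?: unknown };
      return {
        kind: "error",
        type: String(e.type ?? "unknown"),
        message: String(e.message ?? ""),
      };
    }
    return { kind: "error", type: "unknown", message: String(err) };
  }
  return { kind: "data", payload };
}

// Hypothetical error event, for illustration only.
const sampleEvent = parseSseLine('data: {"error": {"type": "example_error", "message": "stopped"}}');
console.log(sampleEvent.kind); // "error"
```

On an error result, stop reading and surface the message; do not feed the failure into an automatic retry of the same stream position, since the connection is already closed.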
Request-ID correlation
When present on a response, keep the X-Request-ID header. Include it in support tickets — Venice keys diagnostic logs by this ID. /crypto/rpc/* routes set it explicitly; many inference routes also include it, but don't assume it's universal — fall back to your own client-side correlation ID.
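The fallback can be made explicit in code. This is a minimal sketch: correlationId is a hypothetical helper, and clientId stands for whatever ID your client generated before sending the request (for example a UUID).

```typescript
// Minimal header accessor; a Fetch API Headers object satisfies this.
type HeaderBag = { get(name: string): string | null };

type Correlation = { id: string; source: "venice" | "client" };

// Prefer the server's X-Request-ID; otherwise fall back to the client-side ID.
function correlationId(headers: HeaderBag, clientId: string): Correlation {
  const serverId = headers.get("X-Request-ID");
  return serverId !== null && serverId.trim() !== ""
    ? { id: serverId.trim(), source: "venice" }
    : { id: clientId, source: "client" };
}
```

Logging the source alongside the ID tells support which of the two identifiers they are looking at.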
Common gotchas
- A 402 from /x402/top-up with no X-402-Payment header is the expected discovery response, not an error. See venice-x402.
- A 500 on /chat/completions with a huge file upload often means the upstream model chose to abort; reduce max_tokens / image size rather than blindly retrying.
- 429 on /crypto/rpc/{network} may mean the 24-hour credit cap tripped, not the per-minute one. Check customMessage.
- DetailedError.details is a Zod _errors tree, not a flat map. Walk it recursively.
- Some endpoints (image generation) echo X-Rate-Limit variants; treat any header whose name starts with X-RateLimit as advisory.
- Don't treat an empty stream chunk as an error; keepalives look like data: [DONE] or empty lines.