venice-api-keys
Venice API Keys
Admin endpoints for managing Bearer API keys. You need an ADMIN key (or parent session) to call these. For wallet-only auth, use venice-auth / venice-x402 instead.
| Endpoint | Purpose |
|---|---|
GET /api_keys |
List your keys (masked). |
POST /api_keys |
Create a new key. Response contains the only copy of the secret. |
PATCH /api_keys |
Update description, expiresAt, consumptionLimit. |
DELETE /api_keys?id=... |
Revoke a key. |
GET /api_keys/{id} |
Full details for one key (usage, limits, expiration). |
GET /api_keys/rate_limits |
Balances + per-model rate-limit tiers for the current key. |
GET /api_keys/rate_limits/log |
Last 50 rate-limit breaches. |
GET /api_keys/generate_web3_key |
Get a SIWE-style token to sign with a wallet. |
POST /api_keys/generate_web3_key |
Authenticate a wallet (holds sVVV) and mint a classic API key. |
Limits: key creation is capped at 20 requests/minute and 500 active keys per user.
Key types
| Type | Can call |
|---|---|
INFERENCE |
Inference endpoints plus any route that only requires authentication — e.g. /chat/*, /image/*, /audio/*, /video/*, /embeddings, /augment/*, /crypto/rpc, /characters, /api_keys/rate_limits*, /support-bot. Rejected from admin routes listed below with 401. |
ADMIN |
Everything an INFERENCE key can do, plus admin-only routes: POST/PATCH/DELETE /api_keys, GET /api_keys (list), GET /api_keys/{id}, GET /billing/balance, GET /billing/usage. |
A leaf app should almost always use INFERENCE keys — per-app, per-user, with consumption caps.
GET /api_keys
curl https://api.venice.ai/api/v1/api_keys \
-H "Authorization: Bearer $ADMIN_KEY"
Returns:
{
"object": "list",
"data": [
{
"id": "uuid",
"apiKeyType": "INFERENCE",
"description": "backend prod",
"createdAt": "2025-10-01T12:00:00Z",
"expiresAt": null,
"lastUsedAt": "2026-04-20T10:05:00Z",
"last6Chars": "2V2jNW",
"consumptionLimits": { "usd": 50, "diem": 10 },
"usage": { "trailingSevenDays": { "usd": "4.20", "diem": "0.00" } }
}
]
}
The full secret is never returned on list — only last6Chars.
POST /api_keys — create
curl https://api.venice.ai/api/v1/api_keys \
-H "Authorization: Bearer $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{
"apiKeyType": "INFERENCE",
"description": "backend prod",
"expiresAt": "2026-12-31T23:59:59Z",
"consumptionLimit": { "usd": 50, "diem": 10 }
}'
Response includes the one-time apiKey secret:
{
"success": true,
"data": {
"id": "uuid",
"apiKey": "VENICE_INFERENCE_KEY_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"apiKeyType": "INFERENCE",
"description": "backend prod",
"expiresAt": "2026-12-31T23:59:59Z",
"consumptionLimit": { "usd": 50, "diem": 10 }
}
}
Save it immediately — Venice won't show the secret again. If you lose it, delete and re-create.
Required
apiKeyTypedescription
Optional
expiresAt— empty string or ISO 8601 date/datetime. Omit for non-expiring.consumptionLimit.usd/.diem— per-epoch caps. Null means no cap on that currency.consumptionLimit.vcu— deprecated (legacy Diem). Usedieminstead.
PATCH /api_keys — update
curl -X PATCH https://api.venice.ai/api/v1/api_keys \
-H "Authorization: Bearer $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{ "id": "uuid", "description": "renamed", "consumptionLimit": { "usd": 100 } }'
Only description, expiresAt, and consumptionLimit are mutable. Pass "expiresAt": "" or null to remove an expiration.
DELETE /api_keys?id=<uuid> — revoke
curl -X DELETE "https://api.venice.ai/api/v1/api_keys?id=uuid" \
-H "Authorization: Bearer $ADMIN_KEY"
Returns {"success": true}. Revocation is immediate.
GET /api_keys/{id} — details
Returns one key's full metadata plus trailing-7-day usage. Useful for an admin dashboard row view.
GET /api_keys/rate_limits
curl https://api.venice.ai/api/v1/api_keys/rate_limits \
-H "Authorization: Bearer $VENICE_API_KEY"
Returns for the calling key:
{
"data": {
"accessPermitted": true,
"apiTier": { "id": "paid", "isCharged": true },
"balances": { "USD": 50.23, "DIEM": 100.023 },
"keyExpiration": "2025-06-01T00:00:00Z",
"nextEpochBegins": "2025-05-07T00:00:00.000Z",
"rateLimits": [
{
"apiModelId": "zai-org-glm-5-1",
"rateLimits": [
{ "type": "RPM", "amount": 100 },
{ "type": "TPM", "amount": 200000 },
{ "type": "RPD", "amount": 10000 }
]
}
]
}
}
Use it to:
- Display current balances in-app.
- Warm-gate calls when the relevant model's RPM cap is near.
- Know when the next epoch resets (DIEM, bundled credits).
GET /api_keys/rate_limits/log
Returns the last 50 rate-limit breaches. Response is wrapped as { object: "list", data: [...] }:
{
"object": "list",
"data": [
{ "apiKeyId": "...", "modelId": "zai-org-glm-5-1", "rateLimitType": "RPM",
"rateLimitTier": "paid", "timestamp": "2026-04-20T12:34:56Z" }
]
}
Feed these into your monitoring when tuning concurrency.
Web3 API keys — two-step wallet flow
Lets a wallet that holds sVVV mint a classic Bearer API key. No Venice account required.
1. GET /api_keys/generate_web3_key
curl https://api.venice.ai/api/v1/api_keys/generate_web3_key
Returns { success: true, data: { token: "<jwt-ish token>" } }.
2. Sign the token with your wallet, then POST /api_keys/generate_web3_key
import { Wallet } from 'ethers'
const { data: { token } } = await fetch(`${base}/api_keys/generate_web3_key`).then(r => r.json())
const wallet = new Wallet(process.env.WALLET_KEY!)
const signature = await wallet.signMessage(token)
const res = await fetch(`${base}/api_keys/generate_web3_key`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
apiKeyType: 'INFERENCE',
description: 'Web3 API Key',
address: wallet.address,
signature,
token,
consumptionLimit: { usd: 50 },
}),
})
const { data } = await res.json()
console.log(data.apiKey) // save this once
The returned apiKey behaves exactly like a normal Bearer key.
Recipes
Per-customer keys with $5 USD limit
await fetch(`${base}/api_keys`, {
method: 'POST',
headers: { Authorization: `Bearer ${ADMIN_KEY}`, 'Content-Type': 'application/json' },
body: JSON.stringify({
apiKeyType: 'INFERENCE',
description: `cust:${customerId}`,
consumptionLimit: { usd: 5 },
}),
})
Rotate monthly; revoke on churn.
Health-check for a key
const { data } = await fetch(`${base}/api_keys/rate_limits`, {
headers: { Authorization: `Bearer ${key}` },
}).then(r => r.json())
if (!data.accessPermitted) alert('Key blocked — top up or change tier')
Errors
| Code | Meaning |
|---|---|
400 |
Bad body (e.g. missing apiKeyType, malformed expiresAt), or attempting to create when you already have 500 active keys. |
401 |
Missing / bad / non-admin key for admin-only routes. |
429 |
Exceeded 20 creates/min. |
500 |
Transient; retry. |
Gotchas
- The secret is returned exactly once, in the
POSTresponse. Losing it = delete + recreate. consumptionLimitis per epoch (day / reset cycle), not per call.INFERENCEkeys can't call admin-only routes (POST/PATCH/DELETE /api_keys,GET /api_keys,GET /api_keys/{id},GET /billing/balance,GET /billing/usage). They can callGET /api_keys/rate_limitsand/api_keys/rate_limits/logfor themselves. Use a separateADMINkey for management.vcuis legacy — usediem.expiresAtof empty string""means "no expiration" in CREATE; on UPDATE it removes an existing one.- Rate-limit log is capped at 50 entries — pull it frequently if debugging bursts.
- The Web3 key flow requires wallet holdings of sVVV; otherwise the signing step is rejected.
More from veniceai/skills
venice-audio-transcription
Transcribe audio files to text via POST /audio/transcriptions. Covers supported models (Parakeet, Whisper, Wizper, Scribe, xAI STT), supported formats (wav/flac/m4a/aac/mp4/mp3/ogg/webm), response formats (json/text), timestamps, and language hints. OpenAI-compatible multipart.
29venice-video
Generate and transcribe videos via Venice. Covers the async /video/quote + /video/queue + /video/retrieve + /video/complete loop, text-to-video, image-to-video, video-to-video (upscale), audio input, reference images, scene and element support, plus /video/transcriptions for YouTube URLs.
28venice-audio-speech
Generate speech from text via POST /audio/speech. Covers TTS models (Kokoro, Qwen 3, xAI, Inworld, Chatterbox, Orpheus, ElevenLabs Turbo, MiniMax, Gemini Flash), voices per family, output formats (mp3/opus/aac/flac/wav/pcm), streaming, prompt/emotion styling, temperature/top_p, and language hints.
28venice-image-generate
Generate images with Venice. Covers POST /image/generate (Venice-native), POST /images/generations (OpenAI-compatible), GET /image/styles (style presets), request fields (prompt, dimensions, cfg_scale, seed, variants, style_preset, aspect_ratio, resolution, safe_mode, watermark), and response formats.
28venice-embeddings
Call POST /embeddings on Venice. Covers request shape (input, model, encoding_format, dimensions, user), OpenAI compatibility, response compression (gzip/br), and practical usage for retrieval, clustering, and RAG.
28venice-errors
Handle Venice API errors correctly. Covers the StandardError / DetailedError / ContentViolationError / X402InferencePaymentRequired body shapes, every meaningful status code (400, 401, 402, 403, 415, 422, 429, 500, 503, 504), the 402 PAYMENT-REQUIRED header used by x402 inference, 422 content-policy suggested_prompt retry pattern, 429 rate-limit headers, and an exponential-backoff retry strategy with idempotency.
27