venice-billing
Venice Billing
Three read-only endpoints for account-level billing and analytics. All are under a Beta tag — schema/behavior may change.
| Endpoint | Purpose |
|---|---|
| `GET /billing/balance` | Current `canConsume` flag, remaining DIEM & USD, epoch allocation. |
| `GET /billing/usage` | Paginated per-request ledger. JSON or CSV. |
| `GET /billing/usage-analytics` | Aggregated breakdowns: by date, model, API key. |
All require Bearer auth (not x402 — for wallet balances, use venice-x402). GET /billing/balance and GET /billing/usage require an ADMIN key — an INFERENCE key gets 401. GET /billing/usage-analytics works on any authenticated key (scoped to the account behind the key).
Currency / priority
Venice debits from, in order:
1. `DIEM`: staked credits (reset per epoch).
2. `BUNDLED_CREDITS`: included in some Pro plans.
3. `USD`: prepaid fiat balance.

- (`VCU`): deprecated legacy DIEM.
consumptionCurrency on /billing/balance reports the current currency being consumed.
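As a rough illustration of the debit order above, the sketch below picks the first currency with a positive balance. The `nextDebitCurrency` helper and the shape of its `balances` argument are hypothetical and exist only for this example; `/billing/balance` reports DIEM and USD only, so a client cannot see bundled credits this way, and the server remains the authority on what is actually debited.

```javascript
// Debit order as listed above. VCU is the deprecated legacy name for DIEM,
// so it is left out of new code.
const DEBIT_ORDER = ["DIEM", "BUNDLED_CREDITS", "USD"];

// Hypothetical helper: `balances` is a plain object keyed by currency name
// that the caller assembles. It is NOT a Venice response shape.
function nextDebitCurrency(balances) {
  for (const currency of DEBIT_ORDER) {
    const amount = balances[currency];
    if (typeof amount === "number" && amount > 0) return currency;
  }
  return null; // nothing left to debit
}
```

For example, `nextDebitCurrency({ DIEM: 0, BUNDLED_CREDITS: 3, USD: 25 })` returns `"BUNDLED_CREDITS"`, because the DIEM bucket is empty and bundled credits come before USD.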
GET /billing/balance
curl https://api.venice.ai/api/v1/billing/balance \
-H "Authorization: Bearer $VENICE_API_KEY"
{
"canConsume": true,
"consumptionCurrency": "DIEM",
"balances": { "diem": 90.5, "usd": 25 },
"diemEpochAllocation": 100
}
- `canConsume: false` means both the DIEM and USD buckets are empty. On this endpoint `canConsume` is `hasPositiveDiemBalance || usdBalance > 0` and does not factor in bundled credits (which are consulted during the actual request in `getConsumableBalanceForRequest`).
- `consumptionCurrency` is `"DIEM"`, `"USD"`, or `null` (when neither applies).
- `balances.diem` is `null` if not staking.
- `diemEpochAllocation` is the ceiling for the current epoch: `balances.diem / diemEpochAllocation` = remaining fraction.
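The field notes above can be folded into a small helper. This is a minimal sketch: `summarizeBalance` is a hypothetical name, and it assumes the response has exactly the shape of the sample JSON, including a `null` `balances.diem` for accounts that are not staking.

```javascript
// Hypothetical helper: condenses a parsed /billing/balance body.
function summarizeBalance(balance) {
  const diem = balance.balances?.diem ?? null; // null when not staking
  const allocation = balance.diemEpochAllocation ?? null;

  // Remaining fraction of the epoch allocation, or null when it cannot
  // be computed (no DIEM balance, or no positive allocation).
  const remainingFraction =
    diem !== null && allocation > 0 ? diem / allocation : null;

  return {
    canConsume: balance.canConsume === true,
    currency: balance.consumptionCurrency ?? null,
    remainingFraction,
  };
}
```

With the sample response, `remainingFraction` is `90.5 / 100`. A caller that depends on bundled credits should not treat `canConsume: false` as final, since this endpoint does not account for them.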
GET /billing/usage
Paginated per-request ledger.
curl "https://api.venice.ai/api/v1/billing/usage?limit=200&page=1&sortOrder=desc&currency=USD&startDate=2026-04-01T00:00:00Z&endDate=2026-04-21T23:59:59Z" \
-H "Authorization: Bearer $VENICE_API_KEY" \
-H "Accept: application/json"
Query parameters
| Param | Notes |
|---|---|
| `currency` | `USD` / `VCU` / `DIEM` / `BUNDLED_CREDITS`. |
| `startDate` / `endDate` | ISO 8601 datetime. |
| `limit` | 1–500. Default 200. |
| `page` | Default 1. |
| `sortOrder` | `asc` / `desc` on `createdAt`. Default `desc`. |
Accept header
- `application/json` (default): paginated JSON.
- `text/csv`: downloads `billing-usage.csv` (sets `Content-Disposition`).
Response (JSON)
{
"warningMessage": "DIEM (formerly VCU) has been renamed...",
"data": [
{
"timestamp": "2026-04-20T12:34:56Z",
"sku": "zai-org-glm-5-1-llm-output-mtoken",
"units": 0.000227,
"pricePerUnitUsd": 2.8,
"amount": -0.06356,
"currency": "DIEM",
"notes": "API Inference",
"inferenceDetails": {
"requestId": "chatcmpl-...",
"promptTokens": 339,
"completionTokens": 227,
"inferenceExecutionTime": 2964
}
}
],
"pagination": { "limit": 200, "page": 1, "total": 1000, "totalPages": 5 }
}
Response headers: x-pagination-{limit,page,total,total-pages}.
Fields
- `sku`: billing line item (model + unit type + format).
- `units`: for LLMs, millions of tokens (e.g. `0.000227` = 227 tokens).
- `pricePerUnitUsd`: rate; for DIEM, DIEM ≈ USD so this doubles as reference.
- `amount`: negative for debit.
- `inferenceDetails`: present for inference SKUs; `requestId` is the `id` returned on the original `/chat/completions` response.
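To make the field semantics concrete, here is a small sketch that converts `units` back to a token count and totals debits per currency. Both helper names are hypothetical, and the token conversion only applies to LLM SKUs, where `units` is in millions of tokens.

```javascript
// For LLM SKUs, `units` is millions of tokens, so multiply by 1e6.
// Math.round absorbs floating-point noise in the product.
function tokensFromUnits(units) {
  return Math.round(units * 1_000_000);
}

// Sum debits (negative amounts) per currency, reported as positive spend.
function totalDebitsByCurrency(entries) {
  const totals = {};
  for (const entry of entries) {
    if (entry.amount >= 0) continue; // skip anything that is not a debit
    totals[entry.currency] =
      (totals[entry.currency] ?? 0) + Math.abs(entry.amount);
  }
  return totals;
}
```

For the sample ledger entry, `tokensFromUnits(0.000227)` gives `227`, which matches `completionTokens` in its `inferenceDetails`.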
GET /billing/usage-analytics
Aggregated summary for dashboards. Cached 10 minutes.
curl "https://api.venice.ai/api/v1/billing/usage-analytics?lookback=7d" \
-H "Authorization: Bearer $VENICE_API_KEY"
Query parameters (choose one approach)
- `lookback=Nd`: `7d`, `30d`, up to `90d`. Default `7d`.
- OR `startDate=YYYY-MM-DD` + `endDate=YYYY-MM-DD`: both required if either is given.
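The two parameter styles can be enforced client-side before a request is sent. The sketch below is one possible approach, not part of the API: `analyticsQuery` is a hypothetical helper, rejecting a mix of both styles is a design choice of this example, and the clamp to 90 days simply makes explicit what the server is described as doing silently.

```javascript
const MAX_DAYS = 90;

// Hypothetical helper: builds the query string for /billing/usage-analytics.
function analyticsQuery({ lookback, startDate, endDate } = {}) {
  const hasRange = startDate !== undefined || endDate !== undefined;
  if (hasRange && lookback !== undefined) {
    throw new Error("use either lookback or startDate/endDate, not both");
  }

  const params = new URLSearchParams();
  if (hasRange) {
    // Both dates are required if either is given.
    const day = /^\d{4}-\d{2}-\d{2}$/;
    if (!day.test(startDate ?? "") || !day.test(endDate ?? "")) {
      throw new Error("startDate and endDate are both required, as YYYY-MM-DD");
    }
    params.set("startDate", startDate);
    params.set("endDate", endDate);
  } else {
    const match = /^(\d+)d$/.exec(lookback ?? "7d");
    if (!match) throw new Error("lookback must look like 7d or 30d");
    const days = Math.min(Number(match[1]), MAX_DAYS);
    params.set("lookback", `${days}d`);
  }
  return params.toString();
}
```

Calling `analyticsQuery()` yields `lookback=7d`, and `analyticsQuery({ lookback: "100d" })` yields `lookback=90d`.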
Response (selected keys)
{
"lookback": "7d",
"byDate": [{ "date": "2026-04-20", "USD": 0.5, "DIEM": 10.25 }, ...],
"byModel": [
{
"modelName": "GLM 5.1",
"unitType": "tokens",
"modelType": "LLM",
"totalUsd": 0.4,
"totalDiem": 12.5,
"totalUnits": 50000,
"breakdown": [
{ "type": "Output", "usd": 0.3, "diem": 10, "units": 35000 },
{ "type": "Input", "usd": 0.1, "diem": 2.5, "units": 15000 }
]
}
],
"byModelDaily": [
{ "date": 1705276800000, "GLM 5.1": 5.5, "Claude Opus 4.7": 3.2 }
],
"byModelDailyUsd": [...],
"topModels": ["GLM 5.1", "Claude Opus 4.7"],
"byKey": [
{ "apiKeyId": "key_abc123", "description": "Production Key",
"totalUsd": 0.8, "totalDiem": 15, "totalUnits": 75000 },
{ "apiKeyId": null, "description": "Web App",
"totalUsd": 0, "totalDiem": 4, "totalUnits": 25000 }
],
"byKeyDaily": [...],
"byKeyDailyUsd": [...],
"topKeyNames": [...]
}
- `byDate` / `byModelDaily` / `byKeyDaily` are pre-shaped for time-series charts.
- `topModels` / `topKeyNames` give top-8 names for legend rendering.
- `apiKeyId: null` in `byKey` means the usage originated from Venice's web app.
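In the sample response, `byDate` rows carry `date` as a `YYYY-MM-DD` string while `byModelDaily` rows carry it as a millisecond timestamp, so joining the two series needs a common key. The sketch below is one way to do that; `dayKey` and `joinDaily` are hypothetical helpers, and it assumes the millisecond timestamps mark midnight UTC, as the sample value suggests.

```javascript
// Convert a Unix-milliseconds timestamp to a YYYY-MM-DD key in UTC.
function dayKey(ms) {
  return new Date(ms).toISOString().slice(0, 10);
}

// Attach the per-model values for each day to the matching byDate row.
function joinDaily(byDate, byModelDaily) {
  const perModel = new Map(
    byModelDaily.map(({ date, ...models }) => [dayKey(date), models])
  );
  return byDate.map((row) => ({
    ...row,
    models: perModel.get(row.date) ?? {}, // empty when no model data that day
  }));
}
```

The sample timestamp `1705276800000` maps to `2024-01-15`.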
Recipes
Abort before calling inference if balance is empty
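const base = 'https://api.venice.ai/api/v1'
const headers = { Authorization: `Bearer ${process.env.VENICE_API_KEY}` }
const { canConsume } = await fetch(`${base}/billing/balance`, { headers }).then(r => r.json())
if (!canConsume) throw new Error('Venice balance exhausted; top up before continuing')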
Monthly CSV export
curl "https://api.venice.ai/api/v1/billing/usage?startDate=2026-04-01T00:00:00Z&endDate=2026-04-30T23:59:59Z&limit=500" \
-H "Authorization: Bearer $VENICE_API_KEY" \
-H "Accept: text/csv" \
-o billing-april.csv
Paginate via page=1,2,3,... until page > totalPages.
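The pagination rule can be sketched as a loop over the JSON form of the ledger. This is a minimal example under stated assumptions: `fetchUsagePage` and `fetchAllUsage` are hypothetical helpers, the `fetchImpl` option exists only so the loop can be exercised without network access, and the response is assumed to match the sample shape (`data` plus `pagination.totalPages`). It uses JSON rather than CSV because the JSON body reports `totalPages` directly.

```javascript
const BASE = "https://api.venice.ai/api/v1";

// Fetch one page of the ledger as parsed JSON.
async function fetchUsagePage(page, { apiKey, query = {}, fetchImpl = fetch } = {}) {
  const params = new URLSearchParams({ ...query, page: String(page) });
  const res = await fetchImpl(`${BASE}/billing/usage?${params}`, {
    headers: { Authorization: `Bearer ${apiKey}`, Accept: "application/json" },
  });
  if (!res.ok) {
    throw new Error(`billing/usage page ${page} failed: HTTP ${res.status}`);
  }
  return res.json();
}

// Walk page=1,2,3,... until page > totalPages and collect every entry.
async function fetchAllUsage(options = {}) {
  const entries = [];
  let page = 1;
  let totalPages = 1;
  do {
    const body = await fetchUsagePage(page, options);
    entries.push(...body.data);
    totalPages = body.pagination.totalPages;
    page += 1;
  } while (page <= totalPages);
  return entries;
}
```

A call such as `fetchAllUsage({ apiKey: process.env.VENICE_API_KEY, query: { limit: "500", startDate: "2026-04-01T00:00:00Z", endDate: "2026-04-30T23:59:59Z" } })` would collect the same month as the curl command above, one page at a time.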
Top-models chart
const a = await fetch(`${base}/billing/usage-analytics?lookback=30d`, { headers }).then(r => r.json())
// chart(a.byModelDaily, { series: a.topModels, xField: 'date' })
Errors
| Code | Meaning |
|---|---|
| 400 | Bad params (`startDate` without `endDate`, calendar range > 90 days). `lookback=100d` is silently clamped to 90 days rather than rejected. |
| 401 | Auth failed, or INFERENCE key used on `/billing/balance` or `/billing/usage` (ADMIN required). |
| 500 | Internal error. |
| 504 | Analytics query timed out. Shorten `lookback` or the date range. |
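One way to act on these codes is sketched below: a lookup that turns a status into a hint, and a loop that halves the lookback window whenever the analytics call returns 504. Both helpers are hypothetical. `getAnalytics` stands for any caller-supplied function that takes a lookback string and resolves to an object with a `status` field, such as a fetch `Response`, and halving is just one reasonable way to shorten the window.

```javascript
// Map the documented status codes to a short hint for logs or UI.
function describeBillingError(status) {
  switch (status) {
    case 400: return "bad parameters: check startDate/endDate pairing and the 90-day limit";
    case 401: return "auth failed, or an INFERENCE key was used where ADMIN is required";
    case 500: return "internal error";
    case 504: return "analytics query timed out: shorten lookback or date range";
    default: return `unexpected status ${status}`;
  }
}

// Retry the analytics call with a shorter lookback each time it times out.
// `getAnalytics` is supplied by the caller (assumption of this sketch).
async function analyticsWithFallback(getAnalytics, lookbackDays = 30) {
  let days = Math.min(lookbackDays, 90);
  for (;;) {
    const res = await getAnalytics(`${days}d`);
    if (res.status !== 504) return { days, res };
    if (days <= 1) throw new Error("analytics timed out even for a 1d lookback");
    days = Math.max(1, Math.floor(days / 2)); // halve the window and retry
  }
}
```

The returned `days` tells the caller which window finally succeeded, so a dashboard can label the chart accordingly.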
Gotchas
- This is Beta, so field names may shift. Validate against `swagger.yaml` periodically.
- `currency` values include legacy `VCU`; use `DIEM` instead in new code.
- `inferenceDetails` is `null` for non-inference SKUs (e.g. subscription charges).
- The analytics endpoint is cached 10 min, so sudden spikes lag in the dashboard by that window.
- `byModelDaily.date` is a Unix milliseconds integer; `byDate.date` is a `YYYY-MM-DD` string. Don't mix them.
- Usage entries from the Venice web app have `apiKeyId: null`; don't drop them when reconciling.
- For x402 (wallet) balance, don't use this endpoint; use `GET /x402/balance/{walletAddress}`.
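When reconciling per-key usage, rows with `apiKeyId: null` need to stay in the totals. A minimal sketch, assuming `byKey` has the shape shown in the sample analytics response; `summarizeByKey` and the `"web-app"` label are hypothetical choices for this example.

```javascript
// Keep web-app usage (apiKeyId: null) in the totals by giving it a label
// instead of filtering it out.
function summarizeByKey(byKey) {
  const rows = byKey.map((k) => ({
    label: k.apiKeyId ?? "web-app",
    description: k.description,
    totalUsd: k.totalUsd,
    totalDiem: k.totalDiem,
  }));
  const totals = rows.reduce(
    (sum, r) => ({
      totalUsd: sum.totalUsd + r.totalUsd,
      totalDiem: sum.totalDiem + r.totalDiem,
    }),
    { totalUsd: 0, totalDiem: 0 }
  );
  return { rows, totals };
}
```

With the sample `byKey` data, the DIEM total is 19 (15 from the production key plus 4 from the web app); dropping the `null` row would understate it by 4.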