Chat Completions — Sarvam AI
[!IMPORTANT] Auth: use the api-subscription-key header, NOT Authorization: Bearer. Base URL: https://api.sarvam.ai/v1
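For callers not using an SDK, the auth rule above can be sketched as a raw HTTP request. This is a minimal standard-library sketch, not an official client: the /chat/completions path is an assumption based on the OpenAI-compatible base URL, and build_chat_request is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://api.sarvam.ai/v1"  # base URL from the note above


def build_chat_request(api_key, model, messages):
    """Build (but do not send) a chat completions request.

    Hypothetical helper: the "/chat/completions" path is assumed from the
    OpenAI-compatible base URL and should be checked against the docs.
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            # The key travels in its own header, not in "Authorization: Bearer".
            "api-subscription-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; building and sending are kept separate here so the headers can be inspected first.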
Models
| Model | Context | Best For |
|---|---|---|
| sarvam-105b | 128K | Complex reasoning, coding, agentic workflows |
| sarvam-30b | 64K | Real-time chat, voice agents, conversational AI |
| sarvam-105b-32k | 32K | Cost-efficient 105B |
| sarvam-30b-16k | 16K | Cost-efficient 30B |
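One way to use the table is to pick the cheapest variant whose context window still fits the request. The sketch below assumes "K" means 1,000 tokens and that the window covers prompt plus output; pick_model and CONTEXT_WINDOWS are hypothetical names, not part of the SDK.

```python
# Context windows copied from the Models table.
# Assumption: "K" means 1,000 tokens.
CONTEXT_WINDOWS = {
    "sarvam-105b": 128_000,
    "sarvam-30b": 64_000,
    "sarvam-105b-32k": 32_000,
    "sarvam-30b-16k": 16_000,
}


def pick_model(family, needed_tokens):
    """Return the smallest-context model in `family` that fits `needed_tokens`.

    `family` is "sarvam-105b" or "sarvam-30b"; the reduced-context variants
    share the family name as a prefix.
    """
    candidates = [
        (window, name)
        for name, window in CONTEXT_WINDOWS.items()
        if name.startswith(family) and window >= needed_tokens
    ]
    if not candidates:
        raise ValueError(f"no {family} model fits {needed_tokens} tokens")
    # Tuples sort by window size first, so min() yields the tightest fit.
    return min(candidates)[1]
```

For example, a 10,000-token request in the 30B family resolves to sarvam-30b-16k, while a 20,000-token request falls back to sarvam-30b.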
Quick Start (Python)
from sarvamai import SarvamAI
client = SarvamAI()
response = client.chat.completions(
    model="sarvam-30b",
    messages=[{"role": "user", "content": "भारत की राजधानी क्या है?"}]
)
print(response.choices[0].message.content)
Streaming (Python)
for chunk in client.chat.completions(
    model="sarvam-30b",
    messages=[{"role": "user", "content": "Write a poem about India"}],
    stream=True
):
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
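When the full text is needed after streaming finishes, the same loop can accumulate the deltas instead of printing them. This is a minimal sketch: collect_stream is a hypothetical helper, and the chunk shape (choices[0].delta.content) is taken from the streaming example above. The stand-in chunks exist only so the snippet runs without network access.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a streamed response into a single string."""
    parts = []
    for chunk in chunks:
        # Skip chunks with no choices or an empty delta, as in the loop above.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in for an SDK chunk, for offline demonstration only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hello"), fake_chunk(None), fake_chunk(", India")]
full_text = collect_stream(demo)
```

With a real client, the iterable returned by client.chat.completions(..., stream=True) would take the place of demo.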
Quick Start (JavaScript/TypeScript)
import { SarvamAIClient } from "sarvamai";
const client = new SarvamAIClient({ apiSubscriptionKey: "YOUR_SARVAM_API_KEY" });
const response = await client.chat.completions({
  model: "sarvam-30b",
  messages: [{ role: "user", content: "भारत की राजधानी क्या है?" }]
});
console.log(response.choices[0].message.content);
OpenAI-Compatible (both languages)
from openai import OpenAI
client = OpenAI(api_key="your-key", base_url="https://api.sarvam.ai/v1")
response = client.chat.completions.create(model="sarvam-30b", messages=[...])
Gotchas
| Gotcha | Detail |
|---|---|
| SDK method | Python: client.chat.completions(...), JS: client.chat.completions({...}) — no .create() in either. OpenAI SDK uses .create() as usual. |
| JS constructor | new SarvamAIClient({ apiSubscriptionKey: "..." }) — NOT SarvamAI(). Key is passed explicitly. |
| content can be None | Models produce reasoning_content before content. If max_tokens is too low, reasoning consumes the budget and content is None. Omit max_tokens or set it to 500 or more. Check reasoning_content as a fallback. |
| reasoning_effort | Set reasoning_effort to "low", "medium", or "high" for thinking mode. NOT thinking=True. |
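The "content can be None" gotcha can be handled with a small guard. The sketch below assumes reasoning_content sits on the same message object as content, which the table implies but does not state; extract_text is a hypothetical helper, and the stand-in response is for offline demonstration only.

```python
from types import SimpleNamespace


def extract_text(response):
    """Return the answer text, falling back to reasoning_content.

    Assumption: reasoning_content is an attribute of choices[0].message.
    Returns an empty string if neither field holds text.
    """
    message = response.choices[0].message
    if message.content:
        return message.content
    # getattr guards against responses that omit the field entirely.
    return getattr(message, "reasoning_content", None) or ""


# Stand-in for a response whose token budget was spent on reasoning.
truncated = SimpleNamespace(
    choices=[SimpleNamespace(
        message=SimpleNamespace(content=None, reasoning_content="Step 1 ...")
    )]
)
text = extract_text(truncated)
```

A fallback like this only recovers the partial reasoning; raising or omitting max_tokens, as the table advises, remains the actual fix.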
Full Docs
Fetch detailed parameters, tool calling, streaming, and examples from:
- https://docs.sarvam.ai/llms.txt — comprehensive docs index
- Chat Completion Guide
- Model Specs
- Rate Limits
Repository: sarvamai/skills (GitHub Stars: 41)
First Seen: Feb 8, 2026