deepseek-integration
DeepSeek Integration
Use When
- Integrate DeepSeek AI models into apps — DeepSeek V3 (general), R1 (reasoning/CoT), API setup (OpenAI-compatible), local deployment with Ollama, distilled model selection, cost vs OpenAI comparison, and Python/JavaScript code patterns.
- The task needs reusable judgment, domain constraints, or a proven workflow rather than ad hoc advice.
Do Not Use When
- The task is unrelated to `deepseek-integration` or would be better handled by a more specific companion skill.
- The request only needs a trivial answer and none of this skill's constraints or references materially help.
Required Inputs
- Gather relevant project context, constraints, and the concrete problem to solve.
- Confirm the desired deliverable: design, code, review, migration plan, audit, or documentation.
Workflow
- Read this `SKILL.md` first, then load only the referenced deep-dive files that are necessary for the task.
- Apply the ordered guidance, checklists, and decision rules in this skill instead of cherry-picking isolated snippets.
- Produce the deliverable with assumptions, risks, and follow-up work made explicit when they matter.
Quality Standards
- Keep outputs execution-oriented, concise, and aligned with the repository's baseline engineering standards.
- Preserve compatibility with existing project conventions unless the skill explicitly requires a stronger standard.
- Prefer deterministic, reviewable steps over vague advice or tool-specific magic.
Anti-Patterns
- Treating examples as copy-paste truth without checking fit, constraints, or failure modes.
- Loading every reference file by default instead of using progressive disclosure.
Outputs
- A concrete result that fits the task: implementation guidance, review findings, architecture decisions, templates, or generated artifacts.
- Clear assumptions, tradeoffs, or unresolved gaps when the task cannot be completed from available context alone.
- References used, companion skills, or follow-up actions when they materially improve execution.
Evidence Produced
| Category | Artifact | Format | Example |
|---|---|---|---|
| Correctness | DeepSeek integration test plan | Markdown doc covering V3 (general), R1 (reasoning/CoT), and OpenAI-compatible API contract tests | docs/ai/deepseek-tests.md |
| Security | DeepSeek key handling note | Markdown doc covering API key storage, rotation, and per-environment configuration | docs/ai/deepseek-key-handling.md |
References
- Use the links and companion skills already referenced in this file when deeper context is needed.
DeepSeek provides high-performance LLMs at a fraction of the cost of OpenAI. The API is OpenAI-compatible: swap the base URL and model name, and everything else is identical.
Model Selection
| Model | Use Case | Context | Notes |
|---|---|---|---|
| `deepseek-chat` | General chat, code, reasoning | 128K | DeepSeek V3 — fast and cheap |
| `deepseek-reasoner` | Complex reasoning, math, science | 128K | DeepSeek R1 — slow but powerful |
| `deepseek-r1-distill-qwen-32b` | Local/self-hosted reasoning | 128K | Good balance of size vs quality |
| `deepseek-r1-distill-llama-70b` | Local/self-hosted, best quality | 128K | Rivals o1-mini |
| `deepseek-r1-distill-qwen-7b` | Edge/on-device | 128K | Smallest usable reasoning model |
Quick rule:
- General tasks → `deepseek-chat` (V3)
- Math, science, complex reasoning → `deepseek-reasoner` (R1)
- Local/private deployment → `deepseek-r1-distill-llama-70b` via Ollama
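A minimal helper that encodes this quick rule for routing code; the task categories and the `pick_model` name are illustrative, not part of any DeepSeek API:

```python
def pick_model(task: str, local: bool = False) -> str:
    """Map a coarse task category to a DeepSeek model name (illustrative sketch)."""
    if local:
        return "deepseek-r1-distill-llama-70b"  # self-hosted via Ollama
    if task in {"math", "science", "reasoning"}:
        return "deepseek-reasoner"  # R1: slower but stronger on multi-step problems
    return "deepseek-chat"  # V3: default for general chat and code
```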
API Setup — Cloud (api.deepseek.com)
DeepSeek API is OpenAI-compatible. Use the OpenAI SDK with a different base URL.
```bash
pip install openai
export DEEPSEEK_API_KEY="sk-..."
```
```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com/v1",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # or "deepseek-reasoner"
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum entanglement simply."},
    ],
    max_tokens=1024,
    temperature=0.7,
)
print(response.choices[0].message.content)
```
Streaming
```python
stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Write a short story about a robot."}],
    stream=True,
    max_tokens=2048,
)

for chunk in stream:
    # Some chunks carry no content delta (e.g. the final usage chunk)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
JavaScript/Node.js (OpenAI SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.DEEPSEEK_API_KEY,
  baseURL: "https://api.deepseek.com/v1",
});

const response = await client.chat.completions.create({
  model: "deepseek-chat",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of Uganda?" },
  ],
  max_tokens: 512,
});
console.log(response.choices[0].message.content);
```
Function Calling (Tool Use)
Identical to OpenAI tool use syntax:
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "What's the weather in Kampala?"}],
    tools=tools,
    tool_choice="auto",
)
```
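When the model decides to call the tool, the response carries `tool_calls` instead of final text; as with OpenAI, you run the function yourself and send the result back in a `tool` message. A minimal sketch of that second round trip (`get_weather` here is a hypothetical implementation backing the schema above):

```python
import json

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = get_weather(**args)  # your own implementation of the declared tool
    followup = client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "user", "content": "What's the weather in Kampala?"},
            message,  # the assistant turn containing the tool call
            {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)},
        ],
    )
    print(followup.choices[0].message.content)
```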
DeepSeek R1 — Reasoning Model
R1 uses Chain-of-Thought (CoT) internally. It "thinks" before answering, making it ideal for:
- Complex math and science problems
- Multi-step logical reasoning
- Code debugging and generation
- Data analysis
```python
# R1 returns a reasoning_content field alongside the answer
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "user", "content": "Solve: If 3x + 7 = 22, what is x?"}
    ],
)

# Access the chain-of-thought reasoning (R1 specific)
reasoning = response.choices[0].message.reasoning_content
answer = response.choices[0].message.content
print(f"Reasoning: {reasoning}")
print(f"Answer: {answer}")
```
Note: `reasoning_content` only exists in `deepseek-reasoner` (R1) responses.
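If one code path serves both models, read the field defensively rather than branching on model name; a small sketch using `getattr`, since the SDK only surfaces `reasoning_content` when the API returns it:

```python
message = response.choices[0].message
# reasoning_content is only present on deepseek-reasoner responses;
# getattr with a default avoids AttributeError on deepseek-chat
reasoning = getattr(message, "reasoning_content", None)
if reasoning:
    print(f"Reasoning: {reasoning}")
print(f"Answer: {message.content}")
```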
Local Deployment with Ollama
Run DeepSeek models locally — fully private, no API costs.
```bash
# Install Ollama (Windows/Mac/Linux)
# https://ollama.ai/download

# Pull a DeepSeek model
ollama pull deepseek-r1:7b    # small, fast
ollama pull deepseek-r1:14b   # balanced
ollama pull deepseek-r1:70b   # best quality (requires 48GB+ RAM)
ollama pull deepseek-v2.5     # V2.5 general model

# Verify
ollama list
ollama run deepseek-r1:7b "Explain machine learning in one sentence"
```
Use Ollama via OpenAI SDK
```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint at localhost:11434
client = OpenAI(
    api_key="ollama",  # any non-empty string
    base_url="http://localhost:11434/v1",
)

response = client.chat.completions.create(
    model="deepseek-r1:7b",  # match the ollama pull name
    messages=[{"role": "user", "content": "What is 17 × 19?"}],
)
print(response.choices[0].message.content)
```
Ollama in Node.js
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "ollama",
  baseURL: "http://localhost:11434/v1",
});

const response = await client.chat.completions.create({
  model: "deepseek-r1:7b",
  messages: [{ role: "user", content: "Summarise this text: ..." }],
});
```
Cost Comparison
DeepSeek V3 API is dramatically cheaper than OpenAI:
| Provider | Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|---|
| DeepSeek | V3 (deepseek-chat) | ~$0.27 | ~$1.10 |
| DeepSeek | R1 (deepseek-reasoner) | ~$0.55 | ~$2.19 |
| OpenAI | GPT-4o | ~$2.50 | ~$10.00 |
| OpenAI | o1 | ~$15.00 | ~$60.00 |
| Anthropic | Claude Sonnet 4.6 | ~$3.00 | ~$15.00 |
DeepSeek V3 is ~10× cheaper than GPT-4o for equivalent tasks.
Training cost: DeepSeek reported ~$5.58M of training compute for the V3 base model, orders of magnitude below OpenAI's estimated multi-billion-dollar spend.
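A quick back-of-envelope helper to turn the table into per-request cost estimates; the prices are hardcoded from the table above and drift over time, so verify against the current pricing pages:

```python
# USD per 1M tokens (input, output), copied from the table above
PRICES = {
    "deepseek-chat": (0.27, 1.10),
    "deepseek-reasoner": (0.55, 2.19),
    "gpt-4o": (2.50, 10.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# 10K prompt + 1K completion: ~$0.0038 (deepseek-chat) vs ~$0.035 (gpt-4o)
print(estimate_cost("deepseek-chat", 10_000, 1_000))
print(estimate_cost("gpt-4o", 10_000, 1_000))
```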
Provider Abstraction (Use DeepSeek + OpenAI Interchangeably)
```python
import os

from openai import OpenAI

def get_client(provider: str = "deepseek") -> tuple[OpenAI, str]:
    if provider == "deepseek":
        return OpenAI(
            api_key=os.environ["DEEPSEEK_API_KEY"],
            base_url="https://api.deepseek.com/v1",
        ), "deepseek-chat"
    elif provider == "openai":
        return OpenAI(api_key=os.environ["OPENAI_API_KEY"]), "gpt-4o"
    elif provider == "ollama":
        return OpenAI(
            api_key="ollama",
            base_url="http://localhost:11434/v1",
        ), "deepseek-r1:7b"
    raise ValueError(f"Unknown provider: {provider}")

def chat(message: str, provider: str = "deepseek") -> str:
    client, model = get_client(provider)
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": message}],
        max_tokens=1024,
    )
    return response.choices[0].message.content
```
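Usage then reads the same regardless of backend (assuming the relevant API key is set, or an Ollama daemon is running locally):

```python
print(chat("Explain MoE in one sentence."))                      # DeepSeek cloud
print(chat("Explain MoE in one sentence.", provider="ollama"))   # local model
```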
In Next.js / Vercel AI SDK
```typescript
import { createOpenAI } from "@ai-sdk/openai";
import { streamText } from "ai";

const deepseek = createOpenAI({
  apiKey: process.env.DEEPSEEK_API_KEY!,
  baseURL: "https://api.deepseek.com/v1",
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = await streamText({
    model: deepseek("deepseek-chat"), // or "deepseek-reasoner"
    messages,
    maxTokens: 2048,
  });
  return result.toDataStreamResponse();
}
```
DeepSeek V3 Architecture Notes
- Mixture of Experts (MoE) — 671B total parameters, only 37B activated per forward pass
- 128K context window — handles long documents and conversations
- OpenAI-compatible API — drop-in replacement for any OpenAI code
- MIT License for R1 — free for commercial use and modification
- Distilled models — smaller versions (1.5B–70B) that retain much of R1's reasoning quality
Anti-Patterns
| Anti-Pattern | Fix |
|---|---|
| Using R1 for simple tasks | Use V3 (`deepseek-chat`) for general tasks — R1 is slower and pricier |
| Hardcoding DeepSeek API key | Use environment variables |
| Not handling rate limits | DeepSeek has lower rate limits than OpenAI — add retry with backoff (see the sketch below) |
| Expecting `reasoning_content` from V3 | Only R1 (`deepseek-reasoner`) returns chain-of-thought reasoning |
| Running 70B model on inadequate hardware | Check VRAM: 7B needs 6GB, 70B needs 48GB+ |
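A minimal retry-with-backoff wrapper for the rate-limit row above; the retry count and delays are illustrative defaults, not documented DeepSeek limits:

```python
import random
import time

from openai import OpenAI, RateLimitError

def chat_with_retry(client: OpenAI, max_retries: int = 5, **kwargs):
    """Call chat.completions.create, retrying 429s with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # 1s, 2s, 4s, 8s ... plus up to 1s of jitter
            time.sleep(2 ** attempt + random.random())
```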
Sources: Aremu — DeepSeek AI from Beginner to Paid Professional (2025); Chakraborty — DeepSeek AI: A Comprehensive Guide (2025); Kits For Life — Mastering DeepSeek-v3 (2025)