# Ollama
Ollama makes running LLMs locally as easy as `docker run`. 2025 updates include Windows and AMD GPU support, multimodal input, and tool calling.
## When to Use
- Local Development: Coding without Wi-Fi access or API costs.
- Privacy: Processing sensitive documents on-device.
- Integration: Works with LangChain, LlamaIndex, and Obsidian natively.
## Core Concepts

### Modelfile
A Docker-like file that defines a custom model (system prompt + base model).

```
FROM llama3
SYSTEM You are Mario from Super Mario Bros.
```
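To build and run the custom model, save those lines as `Modelfile` and feed it to `ollama create`. A minimal sketch — the model name `mario` is an arbitrary choice, and the `ollama` commands themselves are left commented because they require the daemon to be installed and running:

```shell
# Write the Modelfile from the example above.
cat > Modelfile <<'EOF'
FROM llama3
SYSTEM You are Mario from Super Mario Bros.
EOF

# Uncomment to build the custom model and chat with it:
# ollama create mario -f Modelfile
# ollama run mario "Who are you?"
```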
### API

Ollama runs a local server (`http://localhost:11434`) that is compatible with the OpenAI SDK.
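A quick way to exercise the server is the native `/api/generate` endpoint (the OpenAI-compatible routes live under `/v1`). A minimal sketch, assuming the daemon is running and `llama3` has been pulled — the `curl` line is commented out so the snippet stands alone:

```shell
# Request body for Ollama's native generate endpoint.
# "stream": false returns a single JSON object instead of chunked lines.
PAYLOAD='{"model": "llama3", "prompt": "Why is the sky blue?", "stream": false}'

# Uncomment to send the request (requires the Ollama daemon to be running):
# curl -s http://localhost:11434/api/generate -d "$PAYLOAD"
```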
## Best Practices (2025)
Do:
- Use high-speed RAM: Local LLM speed depends on memory bandwidth.
- Use Quantized Models: `q4_K_M` is the sweet spot for the speed/quality balance.
- Unload: Run `ollama stop` when done to free VRAM for games or rendering.
Don't:
- Don't expect GPT-4-level output: smaller local models (~8B parameters) are capable but lack deep reasoning.
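The Do list above can be sketched as one short session. The quantization tag shown is an assumption — check each model's available tags on ollama.com/library — and the `ollama` commands are commented because they require a local install:

```shell
# A typical local session following the practices above.
# Tag name is an assumption; q4_K_M builds are published per model.
MODEL="llama3:8b-instruct-q4_K_M"

# ollama pull "$MODEL"    # download the quantized weights
# ollama run  "$MODEL"    # load into VRAM and chat
# ollama stop "$MODEL"    # unload when done to free VRAM
```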