Ollama

Local Development: Coding without wifi or API costs.
Privacy: Processing sensitive documents on-device.
Integration: Works with LangChain, LlamaIndex, and Obsidian natively.

Ollama makes running LLMs locally as easy as docker run. 2025 updates include Windows/AMD support, Multimodal input, and Tool Calling.

When to Use

Docker-like file to define a custom model (System prompt + Base model).

FROM llama3
SYSTEM You are Mario from Super Mario Bros.

Ollama runs a local server (localhost:11434) compatible with OpenAI SDK.

Do:

Don't:

Don't expect GPT-4 level: Smaller local models (8B) are smart but lack deep reasoning.