Local LLM Ops (Ollama)

Overview

Your localLLM repo provides a full local LLM toolchain on Apple Silicon: setup scripts, a rich CLI chat launcher, benchmarks, and diagnostics. The operational path is to install Ollama, ensure the service is running, initialize the virtualenv, pull models, then launch chat or benchmarks.

Quick Start

./setup_chatbot.sh
./chatllm

If no models are present:

ollama pull mistral
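
To check whether any models are already installed, the standard Ollama CLI listing works (not repo-specific):

ollama list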

Setup Checklist

  1. Install Ollama: brew install ollama
  2. Start the service: brew services start ollama
  3. Run setup: ./setup_chatbot.sh
  4. Verify service: curl http://localhost:11434/api/version
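
As a quick post-setup verification, here is a minimal sketch that confirms the service is answering and ensures at least one model is present; the header stripping assumes ollama list's default tabular output:

# If the API does not answer, restart the service
curl -sf http://localhost:11434/api/version || brew services restart ollama
# If no models are listed (skip the header row), pull a default one
[ -z "$(ollama list 2>/dev/null | tail -n +2)" ] && ollama pull mistral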

Chat Launchers

  • ./chatllm (primary launcher)
  • ./chat or ./chat.py (alternate launchers)
  • Aliases: ./install_aliases.sh then llm, llm-code, llm-fast
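
A sketch of the alias workflow, assuming install_aliases.sh writes the aliases into your shell rc file (the exact file and alias definitions may differ):

./install_aliases.sh
source ~/.zshrc     # reload the shell so the new aliases take effect (adjust for your shell)
llm                 # default chat; llm-code and llm-fast presumably select coding / faster-model presets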

Task modes:

./chat -t coding -m codellama:70b
./chat -t creative -m llama3.1:70b
./chat -t analytical

Benchmark Workflow

Benchmarks are scripted in scripts/run_benchmarks.sh:

./scripts/run_benchmarks.sh

This runs bench_ollama.py with:

  • benchmarks/prompts.yaml
  • benchmarks/models.yaml
  • Multiple runs and a max-token limit
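
For a quick manual sanity check outside the scripted run, timing a single prompt with the plain Ollama CLI is enough to eyeball throughput (the model and prompt here are just examples):

time ollama run mistral "Summarize the benefits of running LLMs locally in one sentence."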

Diagnostics

Run the built-in diagnostic script when setup fails:

./diagnose.sh

Common fixes:

  • Re-run ./setup_chatbot.sh
  • Ensure ollama is in PATH
  • Pull at least one model: ollama pull mistral
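
If diagnose.sh itself will not run, these standalone checks cover the same ground (standard shell and Ollama commands, nothing repo-specific):

command -v ollama || echo "ollama is not on PATH"
curl -s http://localhost:11434/api/version || echo "Ollama service is not responding"
ollama list     # should show at least one model; if empty, pull one (e.g. mistral)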

Operational Notes

  • Virtualenv lives in .venv
  • Chat configs and sessions live under ~/.localllm/
  • Ollama API runs at http://localhost:11434
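
To inspect those locations and the live API (paths taken from the notes above; directory contents vary per machine):

ls .venv/bin | head               # confirm the virtualenv was created
ls ~/.localllm/                   # chat configs and saved sessions
curl -s http://localhost:11434/api/tags    # Ollama API endpoint listing locally installed models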

Related Skills

  • toolchains/universal/infrastructure/docker