# spice-ai

## Add AI Capabilities

Spice integrates AI as a first-class runtime capability: connect to hosted LLM providers or serve models locally, with an OpenAI-compatible API, tool use, text-to-SQL, and model routing, all configured in YAML.
## Configure a Model

```yaml
models:
  - from: <provider>:<model_id>
    name: <model_name>
    params:
      <provider>_api_key: ${ secrets:API_KEY }
      tools: auto            # optional: enable runtime tools
      system_prompt: |       # optional: default system prompt
        You are a helpful assistant.
```
## Supported Providers

| Provider | From Format | Status |
|---|---|---|
| OpenAI (or compatible) | `openai:gpt-4o` | Stable |
| Anthropic | `anthropic:claude-sonnet-4-5` | Alpha |
| Azure OpenAI | `azure:my-deployment` | Alpha |
| Google AI | `google:gemini-pro` | Alpha |
| xAI | `xai:grok-beta` | Alpha |
| Perplexity | `perplexity:sonar-pro` | Alpha |
| Amazon Bedrock | `bedrock:anthropic.claude-3` | Alpha |
| Databricks | `databricks:llama-3-70b` | Alpha |
| Spice.ai | `spiceai:llama3` | Release Candidate |
| HuggingFace | `hf:meta-llama/Llama-3-8B-Instruct` | Release Candidate |
| Local file | `file:./models/llama.gguf` | Release Candidate |
## Using Models

### Chat API (OpenAI-compatible)

Existing applications using OpenAI SDKs can swap endpoints without code changes:

```shell
curl http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
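Because the endpoint is OpenAI-compatible, any HTTP client works. Below is a minimal sketch using only the Python standard library; it assumes the runtime is listening on `localhost:8090` and a model named `gpt4` is configured, as in the curl example above:

```python
import json
import urllib.request

SPICE_URL = "http://localhost:8090/v1/chat/completions"

def build_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """POST the request to the Spice runtime and return the reply text."""
    req = urllib.request.Request(
        SPICE_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Standard OpenAI response shape: first choice's message content.
    return body["choices"][0]["message"]["content"]

# Usage (requires a running Spice instance):
#   reply = chat("gpt4", "Hello")
```

The same call works against a worker by passing the worker's name as the model.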
### CLI

```shell
spice chat
chat> How many orders were placed last month?
```
### Text-to-SQL (NSQL)

The `/v1/nsql` endpoint converts natural language to SQL and executes it. Spice uses tools like `table_schema`, `random_sample`, and `sample_distinct_columns` to help models write accurate SQL:

```shell
curl -XPOST "http://localhost:8090/v1/nsql" \
  -H "Content-Type: application/json" \
  -d '{"query": "What was the highest tip any passenger gave?"}'
```
## Tools (Function Calling)

Tools extend LLM capabilities with runtime functions.

### Built-in Tools

| Tool | Description | Group |
|---|---|---|
| `list_datasets` | List available datasets | auto |
| `sql` | Execute SQL queries | auto |
| `table_schema` | Get table schema | auto |
| `search` | Vector similarity search | auto |
| `sample_distinct_columns` | Sample distinct column values | auto |
| `random_sample` | Random row sampling | auto |
| `top_n_sample` | Top N rows by ordering | auto |
| `memory:load` | Load stored memories | memory |
| `memory:store` | Store new memories | memory |
| `websearch` | Search the web | — |
### Enable Tools

```yaml
models:
  - from: openai:gpt-4o
    name: analyst
    params:
      openai_api_key: ${ secrets:OPENAI_API_KEY }
      tools: auto            # all default tools
      # tools: sql, search   # or specific tools only
```
### Memory (Persistent Context)

```yaml
datasets:
  - from: memory:store
    name: llm_memory
    access: read_write

models:
  - from: openai:gpt-4o
    name: assistant
    params:
      tools: auto, memory
```
### Web Search

```yaml
tools:
  - name: web
    from: websearch
    description: 'Search the web for information.'
    params:
      engine: perplexity
      perplexity_auth_token: ${ secrets:PERPLEXITY_TOKEN }

models:
  - from: openai:gpt-4o
    name: researcher
    params:
      tools: auto, web
```
### MCP Server Integration

```yaml
tools:
  - name: external_tools
    from: mcp
    params:
      mcp_endpoint: http://localhost:3000/mcp
```
### Tool Recursion Limit

```yaml
models:
  - from: openai:gpt-4o
    name: my_model
    params:
      tool_recursion_limit: 3   # default: 10
```
## Model Routing (Workers)

Workers coordinate traffic across multiple models for load balancing, fallback, and weighted routing. Workers are called with the same API as models.

### Round Robin

```yaml
workers:
  - name: balanced
    type: load_balance
    description: Distribute requests evenly.
    load_balance:
      routing:
        - from: model_a
        - from: model_b
```
### Fallback (Priority Order)

```yaml
workers:
  - name: fallback
    type: load_balance
    description: Try GPT-4o first, fall back to Claude.
    load_balance:
      routing:
        - from: gpt4
          order: 1
        - from: claude
          order: 2
```
### Weighted Distribution

```yaml
workers:
  - name: weighted
    type: load_balance
    description: Route 80% to fast model.
    load_balance:
      routing:
        - from: fast_model
          weight: 4   # 80%
        - from: slow_model
          weight: 1   # 20%
```
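Weights are relative shares rather than percentages: with weights 4 and 1, `fast_model` receives 4 of every 5 requests (80%). As an illustrative sketch only (not how Spice schedules requests internally), the distribution can be modeled by expanding each target into a rotation proportional to its weight:

```python
import itertools
from collections import Counter

def weighted_rotation(routing):
    """Yield target names in a repeating cycle proportional to their weights."""
    expanded = [name for name, weight in routing for _ in range(weight)]
    return itertools.cycle(expanded)

# Mirrors the worker above: fast_model weight 4, slow_model weight 1.
picks = weighted_rotation([("fast_model", 4), ("slow_model", 1)])
first_100 = Counter(next(picks) for _ in range(100))
# Over 100 requests: 80 go to fast_model, 20 to slow_model.
```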
## Model Examples

### OpenAI with Tools

```yaml
models:
  - from: openai:gpt-4o
    name: gpt4
    params:
      openai_api_key: ${ secrets:OPENAI_API_KEY }
      tools: auto
```
### OpenAI-Compatible Provider (e.g., Groq)

```yaml
models:
  - from: openai:llama3-groq-70b-8192-tool-use-preview
    name: groq-llama
    params:
      endpoint: https://api.groq.com/openai/v1
      openai_api_key: ${ secrets:GROQ_API_KEY }
```
### With System Prompt and Parameter Overrides

```yaml
models:
  - from: openai:gpt-4o
    name: pirate_haikus
    params:
      system_prompt: |
        Write everything in Haiku like a pirate.
      openai_temperature: 0.1
      openai_response_format: "{ 'type': 'json_object' }"
```
### Local Model (GGUF)

```yaml
models:
  - from: file:./models/llama-3.gguf
    name: local_llama
```
## Evals

Evaluate model performance:

```yaml
evals:
  - name: accuracy_test
    description: Verify model understands the data.
    dataset: test_data
    scorers:
      - Match
```