chroma-local
Instructions
Determine these before writing code. Prefer discovering them from the repo and the user request. Ask only when the choice materially changes the implementation.
- Runtime shape
  - Are they connecting to a running local server, embedding Chroma into tests, or setting up local development from scratch?
  - Decide whether they need chroma run, a Docker or service command, HttpClient or ChromaClient, or Python EphemeralClient.
- Persistence
  - Persistent local data: choose an intentional data path.
  - Disposable test data: use defaults or a temp directory.
- Embedding model
  - Reuse the app's existing embedding provider when possible.
  - Otherwise default to @chroma-core/default-embed in TypeScript or the standard local default in Python.
  - If the user explicitly wants OpenAI embeddings in TypeScript, install and use @chroma-core/openai.
- Indexed data shape
  - Determine what is being indexed, how it should be chunked, and what metadata is needed for filtering and updates.
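
The runtime-shape and persistence decisions above can be made explicit in one place so that a project never switches modes by accident. The following is a minimal Python sketch, not a prescribed pattern: the helper names `resolve_mode` and `make_client`, the `ChromaMode` type, and the environment variables `CHROMA_HOST`, `CHROMA_PORT`, and `CHROMA_DATA_DIR` are all hypothetical and chosen for illustration. Only `chromadb.HttpClient`, `chromadb.PersistentClient`, and `chromadb.EphemeralClient` come from the Python package.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ChromaMode:
    kind: str                      # "http", "persistent", or "ephemeral"
    host: Optional[str] = None
    port: Optional[int] = None
    path: Optional[str] = None


def resolve_mode(env: Mapping[str, str]) -> ChromaMode:
    """Pick exactly one runtime shape and refuse ambiguous configuration."""
    host = env.get("CHROMA_HOST")          # hypothetical variable name
    data_dir = env.get("CHROMA_DATA_DIR")  # hypothetical variable name
    if host and data_dir:
        raise ValueError("Set CHROMA_HOST or CHROMA_DATA_DIR, not both")
    if host:
        port = int(env.get("CHROMA_PORT", "8000"))  # default local port
        return ChromaMode("http", host=host, port=port)
    if data_dir:
        return ChromaMode("persistent", path=data_dir)
    return ChromaMode("ephemeral")  # data is lost when the process exits


def make_client(mode: ChromaMode):
    """Build the chromadb client that matches the resolved mode."""
    import chromadb  # imported lazily so resolve_mode works without the package

    if mode.kind == "http":
        return chromadb.HttpClient(host=mode.host, port=mode.port)
    if mode.kind == "persistent":
        return chromadb.PersistentClient(path=mode.path)
    return chromadb.EphemeralClient()


# Typical call site: client = make_client(resolve_mode(os.environ))
```

Raising on a conflicting configuration, rather than picking one side, keeps the persistence mode a deliberate choice.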
Routing
- Existing local server
  - Confirm host and port before changing client code.
  - Validate the server is reachable before assuming collections are missing.
- Fresh local development
  - Add a local startup path such as chroma run or the repo's existing Docker or service command.
  - Default to localhost:8000 unless the repo already uses another address.
- Python tests or disposable local workflows
  - Prefer EphemeralClient when persistence is unnecessary.
  - Call out that data is lost when the process exits.
- Persistent local development
  - Use a stable data path and make persistence explicit in code or config.
  - Do not silently switch between ephemeral and persistent modes.
- Search integration work
  - Use getOrCreateCollection() in TypeScript or get_or_create_collection() in Python.
  - Design document IDs and metadata so upserts and deletes are straightforward.
  - Batch writes when syncing large datasets.
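
Batched writes for a large sync can be sketched as below. This is an illustrative Python helper, not part of Chroma: `batched`, `sync_records`, the record shape (`id`, `text`, `metadata` keys), and the batch size of 500 are assumptions. The only Chroma call used is `collection.upsert(ids=..., documents=..., metadatas=...)`, and `collection` is whatever `get_or_create_collection()` returned.

```python
from typing import Any, Dict, Iterator, List, Sequence


def batched(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def sync_records(collection: Any, records: List[Dict[str, Any]],
                 batch_size: int = 500) -> int:
    """Upsert records in batches and return the number of batches sent.

    Each record is a dict with "id", "text", and "metadata" keys
    (a hypothetical shape chosen for this sketch).
    """
    sent = 0
    for batch in batched(records, batch_size):
        # Upsert keeps re-runs idempotent as long as IDs are stable.
        collection.upsert(
            ids=[r["id"] for r in batch],
            documents=[r["text"] for r in batch],
            metadatas=[r["metadata"] for r in batch],
        )
        sent += 1
    return sent
```

Because the IDs are supplied by the caller, running the same sync twice overwrites existing entries instead of duplicating them.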
Ask vs proceed
Ask first:
- Embedding model choice (cost and quality implications)
- Whether they need persistent local data
- How they are starting the local server
- Multi-tenant data isolation strategy
Proceed with sensible defaults:
- Use getOrCreateCollection() (TypeScript) / get_or_create_collection() (Python)
- Use cosine similarity (most common)
- Chunk size under 8KB
- Store source IDs in metadata for updates/deletes
- Use a local server on localhost:8000 unless the repo already configures another address or is using Python EphemeralClient
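
The chunk-size and source-ID defaults above can be illustrated with a small, dependency-free Python sketch. Everything here is an assumption made for illustration: the whitespace-based splitting strategy, the helper names `chunk_text` and `build_records`, the `source_id:index` ID scheme, and the `chunk_index` metadata key. A real project may prefer sentence-aware or token-aware chunking.

```python
from typing import Dict, List

MAX_CHUNK_BYTES = 8000  # keeps each chunk under the 8KB default


def chunk_text(text: str, max_bytes: int = MAX_CHUNK_BYTES) -> List[str]:
    """Split text on whitespace into chunks of at most max_bytes (UTF-8).

    A single word longer than max_bytes is kept whole rather than split,
    so input like that needs an extra fallback.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_size = 0
    for word in text.split():
        word_size = len(word.encode("utf-8"))
        # One extra byte for the joining space when the chunk is non-empty.
        added = word_size + (1 if current else 0)
        if current and current_size + added > max_bytes:
            chunks.append(" ".join(current))
            current, current_size = [], 0
            added = word_size
        current.append(word)
        current_size += added
    if current:
        chunks.append(" ".join(current))
    return chunks


def build_records(source_id: str, text: str,
                  max_bytes: int = MAX_CHUNK_BYTES) -> Dict[str, list]:
    """Build ids, documents, and metadatas for one source document."""
    chunks = chunk_text(text, max_bytes)
    return {
        "ids": [f"{source_id}:{i}" for i in range(len(chunks))],
        "documents": chunks,
        "metadatas": [
            {"source_id": source_id, "chunk_index": i}
            for i in range(len(chunks))
        ],
    }
```

With the source ID stored in metadata, a re-index can remove every chunk of one source through a metadata filter, for example `collection.delete(where={"source_id": "post-1"})`, before adding the fresh chunks.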
What to validate
- Correct client import (ChromaClient, HttpClient, or Client)
- Embedding function package is installed (TypeScript)
- Local server is reachable before assuming collections are missing
- Local path and persistence mode are intentional
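
A reachability check can run before any collection lookup. The sketch below uses only the Python standard library; the helper name `server_reachable` and the 2-second timeout are assumptions. It confirms only that something accepts TCP connections on the address, not that the listener is Chroma.

```python
import socket


def server_reachable(host: str = "localhost", port: int = 8000,
                     timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

If the port is open, calling `client.heartbeat()` on the Python client is a stronger follow-up check that the server on the other end actually speaks the Chroma API.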
Implementation notes
- Local Chroma is the right default for development, tests, and self-hosted deployments.
- OSS Chroma does not include Chroma Cloud-only features such as Schema() and Search().
- If the user asks for hybrid dense and sparse retrieval, treat that as a likely Chroma Cloud requirement unless the repo already implements an OSS workaround.
- For open source Chroma, dense retrieval with a single embedding function is the normal baseline.
Minimal patterns
Start a local Chroma server when the repo needs one:
chroma run
Default address: localhost:8000.
TypeScript local client:
import { ChromaClient } from 'chromadb';
import { DefaultEmbeddingFunction } from '@chroma-core/default-embed';

const client = new ChromaClient();
const embeddingFunction = new DefaultEmbeddingFunction();

const collection = await client.getOrCreateCollection({
  name: 'my_collection',
  embeddingFunction,
});

// Add documents
await collection.add({
  ids: ['doc1', 'doc2'],
  documents: ['First document text', 'Second document text'],
});

// Query
const results = await collection.query({
  queryTexts: ['search query'],
  nResults: 5,
});
Python local client:
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_or_create_collection(name="my_collection")

# Add documents
collection.add(
    ids=["doc1", "doc2"],
    documents=["First document text", "Second document text"],
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5,
)
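
Query results come back grouped per query text, so each field is a list of lists. The helper below is a small sketch for flattening the hits of the first query; the name `first_query_hits` and the output dict shape are assumptions, and the input shape is assumed to carry `ids`, `documents`, and `distances` keys as returned by `collection.query()`.

```python
from typing import Any, Dict, List


def first_query_hits(results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the hits for the first query text into a list of dicts."""
    ids = results["ids"][0]
    # Fields that were not requested may be missing or None; pad with None.
    documents = (results.get("documents") or [[None] * len(ids)])[0]
    distances = (results.get("distances") or [[None] * len(ids)])[0]
    return [
        {"id": doc_id, "document": doc, "distance": dist}
        for doc_id, doc, dist in zip(ids, documents, distances)
    ]
```

Usage would be `hits = first_query_hits(results)`; a smaller distance means a closer match.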
Learn More
Fetch Chroma's llms.txt only when you need API or product details that are not already in the repo or this skill: https://docs.trychroma.com/llms.txt
Available Topics
TypeScript
- Chroma Regex Filtering - Learn how to use regex filters in Chroma queries
- Query and Get - Query and Get Data from Chroma Collections
- Metadata - Store and query metadata, including filters and array values
- Updating and Deleting - Update existing documents and delete data from collections
- Error Handling - Handling errors and failures when working with Chroma
- Local Chroma - How to run and use local Chroma
Python
- Chroma Regex Filtering - Learn how to use regex filters in Chroma queries
- Query and Get - Query and Get Data from Chroma Collections
- Metadata - Store and query metadata, including filters and array values
- Updating and Deleting - Update existing documents and delete data from collections
- Error Handling - Handling errors and failures when working with Chroma
- Local Chroma - How to run and use local Chroma
General
- Data Model - An overview of how Chroma stores data
- Integrating Chroma into an existing system - Guidance for adding Chroma search to an existing application
- Chroma CLI - Starting and managing a local open source Chroma server from the CLI