Zvec Vector Database Expert

Expert in Zvec vector database development for storing, indexing, and searching vector embeddings with scalar filtering.

When to Use

  • Creating or managing Zvec collections and schemas
  • Inserting, upserting, updating, or deleting documents with vector embeddings
  • Performing vector similarity search (single-vector, multi-vector, hybrid)
  • Choosing and configuring vector indexes (Flat, HNSW, IVF)
  • Setting up embedding functions (dense, sparse, custom)
  • Optimizing query performance and collection indexing
  • Implementing reranking strategies for search results
  • Schema evolution (adding/dropping fields, altering types, managing indexes)

Quick Reference

| Topic | Reference |
|---|---|
| Quickstart Guide | Installation, basic CRUD, and first search |
| Collections | Create, open, inspect, destroy, optimize |
| Data Operations | Insert, upsert, update, delete, fetch |
| Query & Search | Single-vector, multi-vector, filter, hybrid |
| Concepts & Indexing | Data modeling, vector types, index strategies |
| Embedding & Reranking | Embedding functions, rerankers, custom implementations |

Installation

Python:

pip install zvec

Node.js:

npm install @zvec/zvec

Core Workflow

import zvec

# 1. Optional global config
zvec.init(log_level=zvec.LogLevel.WARN, query_threads=4)

# 2. Define schema
schema = zvec.CollectionSchema(
    name="my_collection",
    fields=[
        zvec.FieldSchema(name="category", data_type=zvec.DataType.ARRAY_STRING,
                         index_param=zvec.InvertIndexParam()),
        zvec.FieldSchema(name="year", data_type=zvec.DataType.INT32,
                         index_param=zvec.InvertIndexParam(enable_range_optimization=True)),
    ],
    vectors=[
        zvec.VectorSchema(name="embedding", dimension=768,
                          index_param=zvec.HnswIndexParam(metric_type=zvec.MetricType.COSINE)),
    ],
)

# 3. Create collection
collection = zvec.create_and_open(path="/path/to/collection", schema=schema)

# 4. Insert documents
collection.insert(zvec.Doc(
    id="doc_1",
    vectors={"embedding": [0.1] * 768},
    fields={"category": ["AI", "ML"], "year": 2024},
))

# 5. Optimize for search performance
collection.optimize()

# 6. Query
results = collection.query(
    vectors=zvec.VectorQuery(field_name="embedding", vector=[0.3] * 768),
    filter="year >= 2020",
    topk=10,
)
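
The schema above declares a 768-dimensional "embedding" vector, and every inserted or queried vector has to match it. The following is a minimal sketch of a pre-insert check in plain Python. The helper name check_dimensions and the expected mapping are hypothetical and not part of Zvec; how Zvec itself reports a mismatched vector is not covered here.

```python
def check_dimensions(vectors, expected):
    """Return a list of error strings; an empty list means the vectors match."""
    errors = []
    for name, dim in expected.items():
        if name not in vectors:
            errors.append(f"missing vector field '{name}'")
        elif len(vectors[name]) != dim:
            errors.append(
                f"'{name}' has {len(vectors[name])} dimensions, expected {dim}"
            )
    # Flag vector fields that the schema does not declare.
    for name in vectors:
        if name not in expected:
            errors.append(f"unknown vector field '{name}'")
    return errors


# Hypothetical mirror of the VectorSchema entries defined in the workflow.
expected = {"embedding": 768}

ok = check_dimensions({"embedding": [0.1] * 768}, expected)
bad = check_dimensions({"embedding": [0.1] * 512}, expected)
print(ok)
print(bad)
```

Running a check like this before collection.insert() keeps dimension mistakes out of the write path, which is useful when embeddings come from more than one model.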

Index Selection Guide

| Index | Best For | Trade-off |
|---|---|---|
| Flat | Small datasets, prototyping, exact recall | Linear scan; slow at scale |
| HNSW | Production, low-latency, high recall (recommended default) | Higher memory footprint |
| IVF | Large datasets with natural clustering | Requires parameter tuning |
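
To make the Flat trade-off concrete, here is a rough sketch of what an exact linear scan does conceptually: score the query against every stored vector, then keep the best topk. It is plain Python written for illustration, not Zvec's implementation, and the names cosine_similarity and flat_search are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flat_search(query, docs, topk=10):
    """Exact search: compare the query with every document, then sort by score."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topk]


docs = {
    "doc_1": [1.0, 0.0],
    "doc_2": [0.0, 1.0],
    "doc_3": [1.0, 1.0],
}
hits = flat_search([1.0, 0.1], docs, topk=2)
print([doc_id for doc_id, _ in hits])
```

Every query touches every vector, so the cost grows linearly with collection size. Approximate indexes such as HNSW and IVF avoid the full scan at the price of extra memory or tuning.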

Data Types

  • Scalar: STRING, BOOL, INT32, INT64, UINT32, UINT64, FLOAT, DOUBLE (+ ARRAY_ variants)
  • Dense vectors: VECTOR_FP32, VECTOR_FP16, VECTOR_INT8
  • Sparse vectors: SPARSE_VECTOR_FP32, SPARSE_VECTOR_FP16
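
The choice of dense vector type mainly affects storage. As a back-of-the-envelope sketch, the snippet below estimates raw vector size per type. The byte widths are an assumption inferred from the type names (FP32 as 4 bytes, FP16 as 2, INT8 as 1), the helper raw_vector_bytes is hypothetical, and the figures exclude index structures and scalar fields.

```python
# Assumed element widths, inferred from the type names rather than from Zvec docs.
BYTES_PER_ELEMENT = {"VECTOR_FP32": 4, "VECTOR_FP16": 2, "VECTOR_INT8": 1}


def raw_vector_bytes(data_type, dimension, num_docs):
    """Raw storage for the vector values alone, with no index overhead."""
    return BYTES_PER_ELEMENT[data_type] * dimension * num_docs


for data_type in BYTES_PER_ELEMENT:
    size_mib = raw_vector_bytes(data_type, 768, 1_000_000) / (1024 ** 2)
    print(f"{data_type}: {size_mib:.1f} MiB for 1M x 768-dim vectors")
```

Under these assumptions, halving the element width halves the raw footprint. Lower-precision types usually cost some recall, so the effect is worth measuring on real data.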

Key Patterns

  • Call optimize() after large batch inserts (roughly 100k documents or more) to merge the flat write buffer into the configured index
  • Use upsert() instead of insert() when document IDs may already exist
  • Index scalar fields you plan to filter on; skip indexing display-only fields
  • Use HnswIndexParam as default index for production workloads
  • Combine dense + sparse vectors with WeightedReRanker or RrfReRanker for hybrid search
  • Monitor collection.stats.index_completeness to track indexing progress
  • Schema is dynamic: add or drop scalar fields without recreating the collection
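
For the hybrid search pattern above, it can help to see what reciprocal rank fusion (RRF) does with a dense and a sparse result list. The sketch below is a generic, plain-Python version of the technique, not RrfReRanker itself. The function name rrf_fuse is hypothetical, and k=60 is the constant commonly used for RRF, not a documented Zvec default.

```python
def rrf_fuse(rankings, k=60, topk=10):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return fused[:topk]


# Hypothetical result IDs from a dense query and a sparse query, best match first.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([dense_hits, sparse_hits], topk=3)
print([doc_id for doc_id, _ in fused])
```

RRF uses only rank positions, so it needs no score normalization between dense and sparse results. A weighted scheme like the one WeightedReRanker suggests combines the scores themselves, so they likely need to be on comparable scales.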