valkey-search-dev
Valkey Search Module - Contributor Reference
Routing
- HNSW graph, ef_construction, ef_runtime, M parameter, ANN search -> Indexes (hnsw)
- FLAT index, brute-force, exact KNN, block-size growth -> Indexes (flat)
- Numeric range queries, BTreeNumeric, SegmentTree, EntriesFetcher -> Indexes (numeric)
- Tag filtering, PatriciaTree, separator, prefix wildcard, case sensitivity -> Indexes (tag)
- Full-text search, Rax trees, postings, stemming, proximity, phrase, fuzzy -> Indexes (text)
- Module loading, ValkeySearch singleton, VMSDK, startup sequence -> Architecture (module-overview)
- IndexSchema class, attributes, keyspace mutations, backfill, sequence numbers -> Architecture (index-schema)
- SchemaManager, index CRUD, staging, FlushDB/SwapDB, RDB load -> Architecture (schema-manager)
- Thread pools, TimeSlicedMRMWMutex, fork suspension, concurrency -> Architecture (thread-model)
- Filter expressions, predicate AST, QueryOperations bitmask, safety limits -> Query (parsing)
- Prefilter vs inline filtering, async dispatch, content resolution -> Query (execution)
- FT.SEARCH handler, RETURN/LIMIT/SORTBY, response serialization -> Query (ft-search)
- FT.AGGREGATE pipeline, GROUPBY/REDUCE, APPLY, expression engine -> Query (ft-aggregate)
- gRPC coordinator, cluster topology, metadata sync, fingerprinting -> Cluster (coordinator)
- RDB protobuf format, SafeRDB, FT.INTERNAL_UPDATE, replication staging -> Cluster (replication)
- FT.INFO fanout, FT._DEBUG subcommands, metrics counters, latency samplers -> Cluster (metrics)
- Building from source, CMake, Ninja, build.sh, dependencies -> Build (build)
- Unit tests, integration tests, pytest, GoogleTest, stability tests -> Build (testing)
- CI workflows, Docker CI, pre-built debs, debugging CI failures -> Build (ci-pipeline)
- Directory layout, IndexBase hierarchy, adding features, VMSDK -> Build (code-structure)
- Query engine overview, hybrid search, pre-filter architecture -> Query (execution)
- Cluster-mode search, shard fanout, coordinator port -> Cluster (coordinator)
- Adding new index types, RDB callbacks, command registration -> Build (code-structure)
- Performance, contention checking, writer suspension, cron jobs -> Architecture (thread-model)
- Vector similarity, VectorBase, embedding storage, distance metrics -> Indexes (hnsw)
- Schema mutations, replication staging protocol, chunk streaming -> Cluster (replication)
- Observability, adding metrics, latency sampling -> Cluster (metrics)
- Expression engine, Record types, RecordSet, reducers -> Query (ft-aggregate)
- Index architecture overview, shard design -> Architecture (module-overview)
- Build with sanitizers, ASAN, TSAN, Valgrind -> Build (build)
Quick Start
```bash
# Build
./build.sh --configure

# Run all tests
./build.sh --run-tests

# Run a single test suite
./build.sh --run-tests=vector_test

# Load the module
valkey-server --loadmodule .build-release/libsearch.so

# Build with sanitizers
./build.sh --configure --asan
./build.sh --configure --tsan
```
Critical Rules
- C++17 codebase - uses std::variant, std::optional, structured bindings throughout
- VMSDK abstraction - never call ValkeyModule_* directly; use the VMSDK wrapper layer
- TimeSlicedMRMWMutex - all index mutations must hold the correct lock; background threads yield to fork
- Protobuf RDB format - index metadata is serialized with protobuf, not raw binary; see SafeRDB for backward compatibility
- Tests are non-negotiable - unit tests (GoogleTest) for internals, pytest integration tests for commands
- gRPC coordinator - cluster mode uses gRPC for metadata sync; never bypass the coordinator protocol
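
As a rough illustration of the C++17 features named in the first rule, the sketch below combines `std::variant`, `std::optional`, and structured bindings. `AttrValue`, `FindAttr`, and `NumericNames` are hypothetical names invented for this example; they are not taken from the valkey-search source.

```cpp
#include <map>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical attribute value: either a numeric or a tag string.
using AttrValue = std::variant<double, std::string>;

// Returns the value stored under `name`, or std::nullopt if it is absent.
inline std::optional<AttrValue> FindAttr(
    const std::map<std::string, AttrValue>& attrs, const std::string& name) {
  if (auto it = attrs.find(name); it != attrs.end()) {
    return it->second;
  }
  return std::nullopt;
}

// Collects the names of all numeric attributes, in key order.
inline std::vector<std::string> NumericNames(
    const std::map<std::string, AttrValue>& attrs) {
  std::vector<std::string> names;
  for (const auto& [name, value] : attrs) {  // structured binding
    if (std::holds_alternative<double>(value)) {
      names.push_back(name);
    }
  }
  return names;
}
```

Returning `std::optional` makes the "attribute missing" case explicit at the call site, without a sentinel value or an out-parameter.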
Architecture
| Topic | Reference |
|---|---|
| Module loading, ValkeySearch singleton, VMSDK, thread pools, config | module-overview |
| IndexSchema class, attribute map, keyspace mutations, backfill | index-schema |
| SchemaManager singleton, index CRUD, replication staging, RDB | schema-manager |
| Thread pools, TimeSlicedMRMWMutex, fork suspension, concurrency | thread-model |
Index Types
| Topic | Reference |
|---|---|
| HNSW graph index, VectorHNSW, hnswlib, ef/M params, inline filtering | hnsw |
| FLAT brute-force index, VectorFlat, block-size growth, exact KNN | flat |
| Numeric index, BTreeNumeric, SegmentTree overlay, range queries | numeric |
| Tag index, PatriciaTree storage, separator, prefix wildcard matching | tag |
| Full-text search, Rax prefix/suffix trees, postings, stemming, fuzzy | text |
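
To make the "brute-force, exact KNN" idea behind the FLAT index concrete, here is a minimal sketch that scores every stored vector against the query and keeps the k closest. It assumes squared L2 distance and plain `std::vector<float>` storage; `SquaredL2` and `BruteForceKnn` are hypothetical helpers and do not reflect how `VectorFlat` is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Squared L2 distance between two vectors of the same dimension.
inline float SquaredL2(const std::vector<float>& a,
                       const std::vector<float>& b) {
  assert(a.size() == b.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const float d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Exact KNN: scan every vector, then return the indices of the k nearest,
// closest first. If k exceeds the data size, all indices are returned.
inline std::vector<std::size_t> BruteForceKnn(
    const std::vector<std::vector<float>>& data,
    const std::vector<float>& query, std::size_t k) {
  std::vector<std::pair<float, std::size_t>> scored;
  scored.reserve(data.size());
  for (std::size_t i = 0; i < data.size(); ++i) {
    scored.emplace_back(SquaredL2(data[i], query), i);
  }
  k = std::min(k, scored.size());
  std::partial_sort(scored.begin(), scored.begin() + k, scored.end());

  std::vector<std::size_t> ids;
  ids.reserve(k);
  for (std::size_t i = 0; i < k; ++i) {
    ids.push_back(scored[i].second);
  }
  return ids;
}
```

The scan costs time linear in the number of stored vectors per query, which is the trade-off against an approximate HNSW graph search.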
Query Engine
| Topic | Reference |
|---|---|
| Filter expression parser, predicate AST, QueryOperations, safety limits | parsing |
| Search execution, prefilter vs inline, async dispatch, content resolution | execution |
| FT.SEARCH handler, parameter parsing, RETURN/LIMIT/SORTBY, response format | ft-search |
| FT.AGGREGATE pipeline, GROUPBY/REDUCE, APPLY, expression engine, Records | ft-aggregate |
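
The "predicate AST" mentioned for the filter parser can be pictured as a small tree of leaf and composite nodes that is evaluated against one document at a time. The sketch below is only an illustration under that assumption: `Doc`, `NumericRange`, `TagEquals`, `And`, `Or`, `Make`, and `Eval` are hypothetical types and functions, not the module's actual classes.

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>
#include <variant>

// Hypothetical document: numeric and tag attributes keyed by field name.
struct Doc {
  std::unordered_map<std::string, double> numerics;
  std::unordered_map<std::string, std::string> tags;
};

struct Predicate;  // forward declaration so composite nodes can own children
using PredicatePtr = std::unique_ptr<Predicate>;

struct NumericRange { std::string field; double lo; double hi; };  // lo <= x <= hi
struct TagEquals { std::string field; std::string value; };
struct And { PredicatePtr lhs; PredicatePtr rhs; };
struct Or { PredicatePtr lhs; PredicatePtr rhs; };

using Node = std::variant<NumericRange, TagEquals, And, Or>;
struct Predicate { Node node; };

// Wraps a node in a heap-allocated Predicate.
inline PredicatePtr Make(Node node) {
  return std::make_unique<Predicate>(Predicate{std::move(node)});
}

// Recursively evaluates the tree against a single document.
inline bool Eval(const Predicate& p, const Doc& doc) {
  return std::visit(
      [&doc](const auto& n) -> bool {
        using T = std::decay_t<decltype(n)>;
        if constexpr (std::is_same_v<T, NumericRange>) {
          auto it = doc.numerics.find(n.field);
          return it != doc.numerics.end() && n.lo <= it->second &&
                 it->second <= n.hi;
        } else if constexpr (std::is_same_v<T, TagEquals>) {
          auto it = doc.tags.find(n.field);
          return it != doc.tags.end() && it->second == n.value;
        } else if constexpr (std::is_same_v<T, And>) {
          return Eval(*n.lhs, doc) && Eval(*n.rhs, doc);
        } else {
          return Eval(*n.lhs, doc) || Eval(*n.rhs, doc);
        }
      },
      p.node);
}
```

A missing field makes a leaf evaluate to false here; that is a choice made for the sketch, not a statement about how the module treats absent attributes.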
Cluster and Replication
| Topic | Reference |
|---|---|
| gRPC coordinator, cluster topology, MetadataManager, reconciliation | coordinator |
| RDB protobuf format, SafeRDB, FT.INTERNAL_UPDATE, replication staging | replication |
| FT.INFO fanout, FT._DEBUG subcommands, Metrics singleton, latency samplers | metrics |
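
One way to think about the fingerprinting mentioned for coordinator metadata sync: each node hashes its serialized index definition, and a mismatch between two fingerprints signals that reconciliation is needed. The sketch below assumes that model and uses FNV-1a purely as a stand-in hash; `Fingerprint` and `NeedsReconcile` are hypothetical names, and the module's real hash function and wire format are not shown here.

```cpp
#include <cstdint>
#include <string_view>

// 64-bit FNV-1a over the serialized bytes (a stand-in chosen for brevity).
inline std::uint64_t Fingerprint(std::string_view serialized) {
  std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
  for (unsigned char byte : serialized) {
    hash ^= byte;
    hash *= 1099511628211ULL;  // FNV-1a 64-bit prime
  }
  return hash;
}

// True when two nodes hold different serialized definitions for an index,
// judged only by comparing fingerprints.
inline bool NeedsReconcile(std::string_view local, std::string_view remote) {
  return Fingerprint(local) != Fingerprint(remote);
}
```

Comparing fixed-size fingerprints lets nodes detect divergence without exchanging the full metadata on every sync round.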
Build and Contributing
| Topic | Reference |
|---|---|
| CMake build system, Ninja, dependencies, build.sh options, sanitizers | build |
| Unit tests (GoogleTest), integration tests (pytest), stability tests | testing |
| CI workflows, Docker-based CI, pre-built debs, debugging CI failures | ci-pipeline |
| Directory layout, IndexBase hierarchy, adding features, VMSDK layer | code-structure |