vm-codebase-indexer
Codebase Indexer Skill
Purpose
This skill builds a local semantic index of a codebase. Once indexed, the agent can run fast semantic searches over millions of lines of code without blowing up the context window.
Step 1: Confirmation Popup (MANDATORY, always do this first)
Before doing ANYTHING else, use the ask_user_input_v0 tool to ask the user:
question: "Ready to index your codebase? This will scan all files and build
a local semantic search index. It may take a minute depending on project size."
type: single_select
options:
- "YES! Let's index it"
- "Not yet"
- If the user picks "Not yet": stop immediately and say something friendly like "No worries! Come back when you're ready." Do not proceed further.
- If the user picks "YES! Let's index it": continue to Step 2.
Step 2: Determine the Codebase Path
Ask the user (in plain text, not a widget) for the root path of the codebase
they want to index. If the current working directory is clearly the project root
(e.g. there's a package.json, pyproject.toml, Cargo.toml, or .git in
it), suggest that path as the default and confirm with the user before using it.
Store the confirmed path as CODEBASE_PATH.
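The root check described above can be sketched roughly as follows. This is only an illustration: the helper names looks_like_project_root and suggest_codebase_path are hypothetical, and the marker list is limited to the four examples named in this step. The returned path is a suggestion only and still needs the user's confirmation.

```python
from pathlib import Path
from typing import Optional

# Marker files/directories named in Step 2 that suggest a project root.
ROOT_MARKERS = ("package.json", "pyproject.toml", "Cargo.toml", ".git")


def looks_like_project_root(directory: Path) -> bool:
    """Return True if the directory contains any known root marker."""
    return any((directory / marker).exists() for marker in ROOT_MARKERS)


def suggest_codebase_path(cwd: Path) -> Optional[Path]:
    """Return cwd as a default suggestion, or None if it is not clearly a root."""
    resolved = cwd.resolve()
    return resolved if looks_like_project_root(resolved) else None
```

When the function returns None, fall back to asking the user for the path outright.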
Step 3: Install Dependencies
Run the following to ensure required packages are available:
pip install chromadb sentence-transformers --break-system-packages -q
If the install fails, report the error to the user and stop.
Step 4: Run the Indexer
Execute the indexer script:
python3 /path/to/skill/scripts/index.py --path "<CODEBASE_PATH>"
Replace /path/to/skill/ with the actual location of this skill's scripts folder.
The script will:
- Walk the directory tree (respecting .gitignore and skipping common noise dirs)
- Chunk each file into meaningful segments
- Embed each chunk using a local embedding model (all-MiniLM-L6-v2)
- Persist everything to a ChromaDB database at ~/.codebase-index/<project-name>/
While it runs, let the user know it's working. The script prints progress to stdout so you can relay updates.
If the script exits with a non-zero code, show the error and stop.
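To make the walk-and-chunk stages more concrete, here is a minimal, dependency-free sketch. It is not the contents of index.py, which this document does not show. The names SKIP_NAMES, iter_source_files, and chunk_lines are hypothetical, the fixed line-window chunking is an assumed stand-in for "meaningful segments", and .gitignore handling, binary detection, embedding, and ChromaDB persistence are deliberately left out.

```python
import os
from pathlib import Path
from typing import Iterator, List

# Names this skill says must never be indexed (assumed to apply to both
# directories and files; the real script may skip more).
SKIP_NAMES = {"node_modules", ".git", "__pycache__", "dist", "build",
              ".next", "venv", ".env"}


def iter_source_files(root: Path) -> Iterator[Path]:
    """Yield files under root, pruning skipped directories in place."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_NAMES)
        for name in sorted(filenames):
            if name not in SKIP_NAMES:
                yield Path(dirpath) / name


def chunk_lines(text: str, max_lines: int = 40, overlap: int = 5) -> List[str]:
    """Split text into overlapping windows of at most max_lines lines."""
    if not 0 <= overlap < max_lines:
        raise ValueError("overlap must be non-negative and smaller than max_lines")
    lines = text.splitlines()
    step = max_lines - overlap
    chunks = []
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + max_lines]))
        if start + max_lines >= len(lines):
            break  # the last window already reached the end of the file
    return chunks
```

The overlap keeps a few lines of shared context between neighbouring chunks, so a function that straddles a window boundary is still partly visible in both.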
Step 5: Confirm Success
When the script finishes, it prints a JSON summary line like:
{"status": "done", "files": 142, "chunks": 891, "db_path": "~/.codebase-index/my-project"}
Parse this and report back to the user in a friendly way, e.g.:
Done! Indexed 142 files across 891 chunks. The index is saved at
~/.codebase-index/my-project and ready to search.
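One way to pull the summary out of the script's output is sketched below. It assumes the summary is the last stdout line that parses as a JSON object with a "status" key, and that earlier lines are free-form progress text. The function names parse_summary and format_summary are hypothetical.

```python
import json
from typing import Optional


def parse_summary(stdout: str) -> Optional[dict]:
    """Return the last stdout line that is a JSON object with a "status" key."""
    for line in reversed(stdout.splitlines()):
        line = line.strip()
        if not line.startswith("{"):
            continue  # progress output, not JSON
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict) and "status" in data:
            return data
    return None


def format_summary(summary: dict) -> str:
    """Turn the parsed summary into the friendly message shown above."""
    return (
        f"Done! Indexed {summary['files']} files across "
        f"{summary['chunks']} chunks. The index is saved at "
        f"{summary['db_path']} and ready to search."
    )
```

If parse_summary returns None, treat the run as failed and show the raw output rather than guessing at the counts.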
Step 6: How to Search the Index (after indexing)
Whenever you need to find relevant code during a task, use the search script:
python3 /path/to/skill/scripts/search.py --db "<DB_PATH>" --query "<your query>" --results 5
This returns the top N most semantically relevant code chunks as JSON. Read them and use their content to answer the user's question. Always prefer searching the index over reading entire files.
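If the search is driven from Python rather than typed into a shell, a wrapper along these lines may help. It uses only the flags shown in the command above. The function names are hypothetical, and since the shape of the returned JSON is not documented here, the result is handed back undecoded beyond json.loads.

```python
import json
import os
import subprocess
from typing import Any, List


def build_search_command(script: str, db_path: str, query: str,
                         results: int = 5) -> List[str]:
    """Build the argument list for search.py using the documented flags."""
    # Without a shell, "~" is not expanded automatically, so expand it here.
    db = os.path.expanduser(db_path)
    return ["python3", script, "--db", db, "--query", query,
            "--results", str(results)]


def run_search(script: str, db_path: str, query: str, results: int = 5) -> Any:
    """Run the search script and decode its stdout as JSON."""
    cmd = build_search_command(script, db_path, query, results)
    # Passing a list (no shell=True) keeps quotes in the query from being
    # interpreted by a shell. check=True raises on a non-zero exit code.
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

The expanduser call matters because the summary in Step 5 reports db_path with a leading "~".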
Important Rules
- Always show the confirmation popup first. Never skip Step 1.
- Never index without the user's explicit YES.
- The index persists between sessions. If one already exists at the same path, the indexer will update it (add new/changed files, skip unchanged ones).
- Respect .gitignore. Never index node_modules, .git, __pycache__, dist, build, .next, venv, .env, or binary files.
- The embedding model runs locally. No data leaves the machine.
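The "skip unchanged files" behaviour could be implemented in several ways. The sketch below assumes content hashing, which this document does not confirm: the real indexer may rely on modification times or something else entirely. Both function names and the known mapping (path string to last indexed digest) are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_reindex(paths: List[Path], known: Dict[str, str]) -> List[Path]:
    """Return new or changed paths and record their digests in known."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        if known.get(str(path)) != digest:
            changed.append(path)
            known[str(path)] = digest  # remember the version just seen
    return changed
```

Hashing content rather than trusting timestamps avoids re-embedding files that were merely touched, at the cost of reading every file on each run.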