duckdb-docs

Installation
SKILL.md

You are helping the user find relevant DuckDB or DuckLake documentation.

Query: $@

Follow these steps in order.

Step 1 — Check DuckDB is installed

command -v duckdb

If not found, delegate to /duckdb-skills:install-duckdb and then continue.

Step 2 — Ensure required extensions are installed

duckdb :memory: -c "INSTALL httpfs; INSTALL fts;"

If this fails, report the error and stop.

Step 3 — Choose the data source and extract search terms

The query is: $@

Data source selection

There are two search indexes available:

Index Remote URL Local cache filename Versions Use when
DuckDB docs + blog https://duckdb.org/data/docs-search.duckdb duckdb-docs.duckdb lts, current, blog Default — any DuckDB question
DuckLake docs https://ducklake.select/data/docs-search.duckdb ducklake-docs.duckdb stable, preview Query mentions DuckLake, catalogs, or DuckLake-specific features

Both indexes share the same schema:

Column Type Description
chunk_id VARCHAR (PK) e.g. stable/sql/functions/numeric#absx
page_title VARCHAR Page title from front matter
section VARCHAR Section heading (null for page intros)
breadcrumb VARCHAR e.g. SQL > Functions > Numeric
url VARCHAR URL path with anchor
version VARCHAR See table above
text TEXT Full markdown of the chunk

By default, search DuckDB docs and filter to version = 'lts'. Use different versions when:

  • The user explicitly asks about current/nightly features → version = 'current'
  • The user asks about a blog post or wants background/motivation → version = 'blog'
  • The user asks about DuckLake → search the DuckLake index with version = 'stable'
  • When unsure, omit the version filter to search across all versions.

Search terms

If the input is a natural language question (e.g. "how do I find the most frequent value"), extract the key technical terms (nouns, function names, SQL keywords) to form a compact BM25 query string. Drop stop words like "how", "do", "I", "the".

If the input is already a function name or technical term (e.g. arg_max, GROUP BY ALL), use it as-is.

Use the extracted terms as SEARCH_QUERY in the next step.

Step 4 — Ensure local cache is fresh

The cache lives at $HOME/.duckdb/docs/CACHE_FILENAME (where CACHE_FILENAME is duckdb-docs.duckdb or ducklake-docs.duckdb per Step 3).

First, ensure the directory exists:

mkdir -p "$HOME/.duckdb/docs"

Then check whether the cache file exists and is fresh (≤2 days old):

CACHE_FILE="$HOME/.duckdb/docs/CACHE_FILENAME"
if [ -f "$CACHE_FILE" ]; then
    MTIME=$(stat -f %m "$CACHE_FILE" 2>/dev/null || stat -c %Y "$CACHE_FILE")
    CACHE_AGE_DAYS=$(( ( $(date +%s) - MTIME ) / 86400 ))
else
    CACHE_AGE_DAYS=999
fi
echo "Cache age: $CACHE_AGE_DAYS days"

If CACHE_AGE_DAYS ≤ 2 → skip to Step 5.

Otherwise (stale or missing) → fetch the index:

duckdb -c "
LOAD httpfs;
LOAD fts;
ATTACH 'REMOTE_URL' AS remote (READ_ONLY);
ATTACH '$HOME/.duckdb/docs/CACHE_FILENAME.tmp' AS tmp;
COPY FROM DATABASE remote TO tmp;
" && mv "$HOME/.duckdb/docs/CACHE_FILENAME.tmp" "$HOME/.duckdb/docs/CACHE_FILENAME"

Replace REMOTE_URL and CACHE_FILENAME per Step 3. If the fetch fails (network error), report the error and stop.

Step 5 — Search the docs

duckdb "$HOME/.duckdb/docs/CACHE_FILENAME" -readonly -json -c "
LOAD fts;
SELECT
    chunk_id, page_title, section, breadcrumb, url, version, text,
    fts_main_docs_chunks.match_bm25(chunk_id, 'SEARCH_QUERY') AS score
FROM docs_chunks
WHERE score IS NOT NULL
  AND version = 'VERSION'
ORDER BY score DESC
LIMIT 8;
"

Replace CACHE_FILENAME, SEARCH_QUERY, and VERSION per Step 3. Remove the AND version = 'VERSION' line if searching across all versions.

If the user's question could benefit from both DuckDB docs and blog results, run two queries (one with version = 'stable', one with version = 'blog') or omit the version filter entirely.

Step 6 — Handle errors

  • Extension not installed (httpfs or fts not found): run duckdb :memory: -c "INSTALL httpfs; INSTALL fts;" and retry.
  • ATTACH fails / network unreachable: inform the user that the docs index is unavailable and suggest checking their internet connection. The DuckDB index is hosted at https://duckdb.org/data/docs-search.duckdb and the DuckLake index at https://ducklake.select/data/docs-search.duckdb.
  • No results (all scores NULL or empty result set): try broadening the query — drop the least specific term, or try a single-word version of the query — then retry Step 5. If still no results, tell the user no matching documentation was found and suggest visiting https://duckdb.org/docs or https://ducklake.select/docs directly.

Step 7 — Present results

For each result chunk returned (ordered by score descending), format as:

### {section} — {page_title}
{url}

{text}

---

After presenting all chunks, synthesize a concise answer to the user's original question ($@) based on the retrieved documentation. If the chunks directly answer the question, lead with the answer before showing the sources.

Weekly Installs
77
GitHub Stars
403
First Seen
Mar 20, 2026
Installed on
codex75
opencode73
kimi-cli71
github-copilot71
gemini-cli71
amp71