# Docs Manage

Index, refresh, and remove library documentation in the local Grounded Docs store. These commands modify the index and write plain-text status messages to stdout.
## When to use
- A library is not yet indexed and you need its docs available for search.
- Documentation may be stale and you want to pull in updated pages.
- You want to remove a library or version from the index to free space.
## Commands

### scrape

Download and index documentation from a URL or local directory.

```shell
npx @arabold/docs-mcp-server@latest scrape <library> <url> [options]
```
| Flag | Alias | Default | Description |
|---|---|---|---|
| `--version <ver>` | `-v` | | Library version label |
| `--max-pages <n>` | `-p` | config default | Maximum pages to scrape |
| `--max-depth <n>` | `-d` | config default | Maximum navigation depth |
| `--max-concurrency <n>` | `-c` | config default | Concurrent page requests |
| `--ignore-errors` | | `true` | Continue on individual page errors |
| `--scope <scope>` | | `subpages` | Crawling boundary: `subpages`, `hostname`, or `domain` |
| `--follow-redirects` | | `true` | Follow HTTP redirects |
| `--no-follow-redirects` | | | Disable following redirects |
| `--scrape-mode <mode>` | | `auto` | HTML processing strategy: `auto`, `fetch`, or `playwright` |
| `--include-pattern <glob>` | | | URL include pattern (repeatable) |
| `--exclude-pattern <glob>` | | | URL exclude pattern (repeatable, takes precedence) |
| `--header "Name: Value"` | | | Custom HTTP header (repeatable) |
| `--embedding-model <model>` | | | Embedding model configuration |
| `--server-url <url>` | | | Remote pipeline worker URL |
| `--clean` | | `true` | Clear existing documents before scraping |
| `--quiet` | | | Suppress non-error diagnostics |
| `--verbose` | | | Enable debug logging |
Examples:

```shell
# Scrape React docs, version-tagged
npx @arabold/docs-mcp-server@latest scrape react https://react.dev/reference/react --version 19.0.0

# Scrape local files
npx @arabold/docs-mcp-server@latest scrape mylib file:///Users/me/docs/my-library

# Scrape with depth and page limits
npx @arabold/docs-mcp-server@latest scrape nextjs https://nextjs.org/docs --max-pages 200 --max-depth 3

# Scrape with custom headers (e.g. authentication)
npx @arabold/docs-mcp-server@latest scrape internal-api https://docs.internal.com \
  --header "Authorization: Bearer tok_xxx"

# Exclude changelog pages
npx @arabold/docs-mcp-server@latest scrape react https://react.dev/reference/react \
  --exclude-pattern "**/changelog*"
```

Output is a plain-text status line, e.g. `Successfully scraped 42 pages.` Progress updates appear on stderr during the run.
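Because each `scrape` invocation is a one-shot command, batch indexing is just a loop. The sketch below collects one command per library into a variable and prints them for review rather than running them; the library/URL/version triples are placeholders, so substitute your own:

```shell
#!/bin/sh
# Sketch: build one scrape command per library from a small inline list.
# Placeholders only -- replace the triples with your own libraries.
cmds=""
while read -r lib url ver; do
  cmds="$cmds
npx @arabold/docs-mcp-server@latest scrape $lib $url --version $ver"
done <<'EOF'
react https://react.dev/reference/react 19.0.0
nextjs https://nextjs.org/docs 15.0.0
EOF

# Print the queued commands for a sanity check before executing.
echo "$cmds"
```

Printing first lets you check flags and URLs; piping the output to `sh` would then execute the queue.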
### refresh

Re-scrape an existing library version, skipping unchanged pages via HTTP ETags.

```shell
npx @arabold/docs-mcp-server@latest refresh <library> [options]
```
| Flag | Alias | Description |
|---|---|---|
| `--version <ver>` | `-v` | Version to refresh (omit for latest) |
| `--embedding-model <model>` | | Embedding model configuration |
| `--server-url <url>` | | Remote pipeline worker URL |
| `--quiet` | | Suppress non-error diagnostics |
| `--verbose` | | Enable debug logging |
Example:

```shell
npx @arabold/docs-mcp-server@latest refresh react --version 19.0.0
```

The library and version must already be indexed. Use `scrape` for first-time indexing.
### remove

Delete a library (or a specific version) from the index.

```shell
npx @arabold/docs-mcp-server@latest remove <library> [options]
```
| Flag | Alias | Description |
|---|---|---|
| `--version <ver>` | `-v` | Specific version to remove (omit to remove latest) |
| `--server-url <url>` | | Remote pipeline worker URL |
| `--quiet` | | Suppress non-error diagnostics |
| `--verbose` | | Enable debug logging |
Example:

```shell
npx @arabold/docs-mcp-server@latest remove react --version 18.3.1
```

This is destructive and cannot be undone. Re-run `scrape` to re-index.
## Output behaviour
All three commands write plain-text status messages to stdout and diagnostics to stderr. The global `--output` flag is accepted but has no effect because the output is plain text, not structured data.

In non-interactive sessions, diagnostics are suppressed by default. Use `--verbose` (or set `LOG_LEVEL=INFO`) to re-enable them. Use `--quiet` to suppress all non-error diagnostics regardless of session type.
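The stdout/stderr split makes these commands easy to wire into scripts: redirect each stream to its own file and the status line stays clean. The sketch below uses a hypothetical stand-in function in place of a real scrape (so it runs anywhere, offline), but the redirections apply unchanged to the actual commands:

```shell
#!/bin/sh
# Stand-in mimicking the commands' output contract: status on stdout,
# diagnostics on stderr. Replace "emit" with a real scrape/refresh call.
emit() {
  echo "fetching page 1/2" >&2
  echo "fetching page 2/2" >&2
  echo "Successfully scraped 2 pages."
}

# Keep the status line, log diagnostics separately.
emit > status.txt 2> diagnostics.log

cat status.txt   # -> Successfully scraped 2 pages.
```

This is the usual pattern for cron jobs: `status.txt` holds the one-line result, `diagnostics.log` holds the progress chatter.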
## Typical workflow

```shell
# 1. Index documentation for the first time
npx @arabold/docs-mcp-server@latest scrape react https://react.dev/reference/react --version 19.0.0

# 2. Later, refresh to pick up any changes
npx @arabold/docs-mcp-server@latest refresh react --version 19.0.0

# 3. Clean up old versions
npx @arabold/docs-mcp-server@latest remove react --version 18.3.1
```
## Important notes

- Scraping can take time. Large documentation sites with hundreds of pages may run for several minutes. Use `--max-pages` and `--max-depth` to limit scope when you only need a subset.
- Local files must use the `file://` URL scheme (e.g. `file:///absolute/path/to/docs`).
- `--clean` is on by default for `scrape`, meaning existing documents for the same library+version are removed before re-indexing. Pass `--no-clean` to append instead.
- `refresh` only works on previously indexed content. It uses HTTP ETags to skip pages that have not changed, making it much faster than a full re-scrape.
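Since `refresh` only works on previously indexed content, a natural scripting pattern is to try it first and fall back to a full `scrape`. The shell `||` fallback is sketched below with hypothetical stand-in functions, under the assumption (not confirmed by this page) that `refresh` exits non-zero for an unindexed library; substitute the real commands:

```shell
#!/bin/sh
# Hypothetical stand-ins: pretend only "react" is already indexed.
refresh() { [ "$1" = "react" ]; }
scrape()  { echo "full scrape of $1"; }

# Refresh when possible, otherwise do a first-time scrape.
refresh react || scrape react   # refresh succeeds; scrape is skipped
refresh vue   || scrape vue     # falls back; prints "full scrape of vue"
```

With the real CLI, `refresh "$lib" --version "$ver" || scrape "$lib" "$url" --version "$ver"` expresses the same "update if indexed, otherwise index" step.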