crawl4ai-fetch

Use scripts/crawl.py to fetch a URL and return its content as Markdown.

Configuration

Configuration is resolved in the following priority order:

1. Environment variables (highest priority)
2. .env file in the current working directory (auto-loaded if present)
3. Built-in defaults

| Env var | Purpose | Default |
| --- | --- | --- |
| `CRAWL4AI_URL` | Base URL of the crawl4ai instance | `https://crawl.981234.xyz` |
| `CRAWL4AI_TOKEN` | Bearer token for auth (optional) | empty (no auth header sent) |

Example .env:

CRAWL4AI_URL=https://crawl.example.com
CRAWL4AI_TOKEN=your-secret-token
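
The priority order described above can be sketched in Python as follows. This is a minimal illustration, not the code in scripts/crawl.py: the helper names `parse_dotenv` and `resolve_config` are hypothetical, and the .env parser only handles plain `KEY=VALUE` lines.

```python
import os
from pathlib import Path

# Built-in defaults, copied from the configuration table.
DEFAULTS = {
    "CRAWL4AI_URL": "https://crawl.981234.xyz",
    "CRAWL4AI_TOKEN": "",
}


def parse_dotenv(path):
    """Hypothetical helper: read simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    dotenv = Path(path)
    if not dotenv.is_file():
        return values
    for line in dotenv.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_config(environ=None, dotenv_path=".env"):
    """Hypothetical helper: environment variables > .env file > built-in defaults."""
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)
    # Layer 2: values from the .env file override the defaults.
    file_values = parse_dotenv(dotenv_path)
    config.update({k: v for k, v in file_values.items() if k in DEFAULTS})
    # Layer 1: real environment variables override everything else.
    config.update({k: environ[k] for k in DEFAULTS if k in environ})
    return config
```

Applying the layers from lowest to highest priority keeps the override rule to a single pattern: whichever source is applied last wins.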

Usage

# Basic fetch
python3 scripts/crawl.py "https://example.com/"

# Use bm25 filter with a relevance query (returns only the most relevant sections)
python3 scripts/crawl.py "https://docs.example.com/api" --filter bm25 --query "authentication"

# Custom instance with auth
CRAWL4AI_URL=https://crawl.example.com CRAWL4AI_TOKEN=my-token python3 scripts/crawl.py "https://example.com/"

Filter modes

| Mode | Description |
| --- | --- |
| `fit` | (default) Smart extraction: removes boilerplate, keeps main content |
| `raw` | Full page Markdown with no filtering |
| `bm25` | BM25-ranked relevance filter; requires `--query` |
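
As a rough sketch of how the three modes and the `--query` requirement could be enforced on the command line, the following uses `argparse`. It is an assumption about structure, not the parser in scripts/crawl.py; the function name `parse_args` and the attribute name `filter_mode` are hypothetical. Note that `argparse` exits with status 2 on usage errors, which may differ from the script's own behavior.

```python
import argparse

# The three modes listed in the table above.
FILTER_MODES = ("fit", "raw", "bm25")


def parse_args(argv):
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(
        description="Fetch a URL as Markdown (illustrative sketch)."
    )
    parser.add_argument("url", help="page to fetch")
    parser.add_argument(
        "--filter",
        dest="filter_mode",
        choices=FILTER_MODES,
        default="fit",  # documented default mode
    )
    parser.add_argument("--query", default=None, help="relevance query for bm25")
    args = parser.parse_args(argv)
    # bm25 ranks sections against a query, so a query must be present.
    if args.filter_mode == "bm25" and not args.query:
        parser.error("--filter bm25 requires --query")
    return args
```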

Output format

The script prints plain Markdown text to stdout. Pipe or capture it as needed:

python3 scripts/crawl.py "https://example.com/" > page.md

On failure, an error message is printed to stderr and the script exits with code 1.

Workflow

1. Run the script with the target URL, capturing stdout.
2. Pass the Markdown content to the LLM for summarization, Q&A, or analysis.
3. For long pages, use --filter bm25 --query "topic" to get only the relevant sections.
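
The first workflow step can be sketched as a small Python wrapper that runs the script, captures stdout, and surfaces the stderr message when the exit code is non-zero. The names `build_command` and `fetch_markdown` are hypothetical, and using `sys.executable` as the interpreter is an assumption (the usage examples call `python3` directly).

```python
import subprocess
import sys


def build_command(url, filter_mode="fit", query=None,
                  script="scripts/crawl.py", python=sys.executable):
    """Hypothetical helper: assemble the documented command line."""
    cmd = [python, script, url]
    if filter_mode != "fit":  # "fit" is the default, so the flag can be omitted
        cmd += ["--filter", filter_mode]
    if query is not None:
        cmd += ["--query", query]
    return cmd


def fetch_markdown(url, **kwargs):
    """Run the script and return its stdout; raise if it exits non-zero."""
    result = subprocess.run(
        build_command(url, **kwargs), capture_output=True, text=True
    )
    if result.returncode != 0:
        # On failure the script reports the error on stderr.
        message = result.stderr.strip() or f"exit code {result.returncode}"
        raise RuntimeError(message)
    return result.stdout
```

A caller would then pass the returned string to the LLM, for example `fetch_markdown("https://docs.example.com/api", filter_mode="bm25", query="authentication")` for a long page.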

Notes

The request timeout is 60 s to allow for JavaScript-heavy pages.
If CRAWL4AI_TOKEN is unset or empty, the Authorization header is omitted, which suits public instances.
The script always fetches fresh content (c=0); the server-side cache is not used.
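
The token rule in these notes amounts to a conditional header. A minimal sketch, assuming a hypothetical helper `build_headers` and a constant name `TIMEOUT_SECONDS` that do not necessarily appear in the script:

```python
# Timeout value taken from the notes; the constant name is an assumption.
TIMEOUT_SECONDS = 60


def build_headers(token):
    """Hypothetical helper: add a Bearer header only when a token is set."""
    headers = {}
    if token:  # None and "" both mean "no auth"
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Leaving the header out entirely, rather than sending an empty `Bearer` value, avoids presenting a malformed credential to instances that do not expect one.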

Repository: ichuan/skills

First seen: Apr 24, 2026
