Firecrawl CLI

Use the firecrawl CLI for web scraping, search, and browser automation. It returns clean markdown optimized for LLM context windows and handles JavaScript rendering.

The toolkit has two layers:

  • Core tools — search, scrape, map, crawl, agent. These handle the vast majority of tasks.
  • Browser tools — interactive cloud Chromium sessions (click, fill, scroll, snapshot, etc.). Use only when core tools can't get the data.

Run firecrawl --help or firecrawl <command> --help for full option details on any command.

Prerequisites

The firecrawl CLI must already be installed and authenticated. Check status:

firecrawl --status

If not ready, refer to rules/install.md for setup instructions.

Workflow

Follow this escalation pattern when fetching web data:

  1. Search — Start here when you don't have a specific URL. Find pages, answer questions, discover sources.
  2. Scrape — You have a URL. Extract its content directly. Use --wait-for if JS needs to render.
  3. Map + Scrape — The site is large or you need a specific subpage. Use map --search to find the right URL, then scrape it.
  4. Crawl — You need bulk content from an entire site section (e.g., all docs pages).
  5. Browser — Scrape didn't return the needed data because it's behind interaction (pagination, modals, form submissions).

Note: search --scrape already fetches full page content for every result. Don't scrape those URLs again individually.
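
For example, steps 1 through 3 chained together (acme.dev is an illustrative domain, not a real endpoint):

# 1. No URL yet: find candidate sources
firecrawl search "acme sdk rate limits" --limit 5 -o .firecrawl/search-rate-limits.json --json
# 2. Scrape the most promising URL from the results
firecrawl scrape "https://docs.acme.dev/limits" --only-main-content -o .firecrawl/acme-limits.md
# 3. Large site: map to find the right subpage before scraping
firecrawl map "https://docs.acme.dev" --search "webhooks" -o .firecrawl/acme-urls.txt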

Organization

Create a .firecrawl/ folder in the working directory to store results, and add it to .gitignore so scraped output never gets committed. One way to set that up:
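mkdir -p .firecrawl
grep -qxF '.firecrawl/' .gitignore 2>/dev/null || echo '.firecrawl/' >> .gitignore

Always use -o to write output to files; this keeps large results out of your context window: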

firecrawl search "your query" -o .firecrawl/search-{query}.json
firecrawl scrape "<url>" -o .firecrawl/{site}-{path}.md

Organize into subdirectories when it makes sense:

.firecrawl/competitor-research/
.firecrawl/docs/nextjs/
.firecrawl/news/2024-01/

Always quote URLs — shell interprets ? and & as special characters.
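
For example, run unquoted, the shell would truncate this URL at the & and background the command (example.com is illustrative):

firecrawl scrape "https://example.com/search?q=widgets&page=2" -o .firecrawl/example-search.md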

Commands

Search

Web search with optional scraping. Run firecrawl search --help for all options.

firecrawl search "your query" -o .firecrawl/search-result.json --json
firecrawl search "your query" --limit 10 -o .firecrawl/search-result.json --json
firecrawl search "your query" --sources news -o .firecrawl/search-news.json --json
firecrawl search "your query" --sources images -o .firecrawl/search-images.json --json
firecrawl search "your query" --tbs qdr:d -o .firecrawl/search-today.json --json
firecrawl search "your query" --scrape -o .firecrawl/search-scraped.json --json
firecrawl search "your query" --categories github -o .firecrawl/search-github.json --json
firecrawl search "your query" --location "San Francisco,California,United States" -o .firecrawl/search-local.json --json

Options: --limit <n>, --sources <web,images,news>, --categories <github,research,pdf>, --tbs <qdr:h|d|w|m|y> (results from the past hour, day, week, month, or year), --location, --country <code>, --scrape, --scrape-formats <formats>, -o <path>

Scrape

Single page content extraction. Run firecrawl scrape --help for all options.

firecrawl scrape "<url>" -o .firecrawl/page.md
firecrawl scrape "<url>" --only-main-content -o .firecrawl/page.md
firecrawl scrape "<url>" --format markdown,links -o .firecrawl/page.json
firecrawl scrape "<url>" --wait-for 3000 -o .firecrawl/page.md
firecrawl scrape "<url>" --include-tags article,main -o .firecrawl/page.md
firecrawl scrape "<url>" --exclude-tags nav,aside -o .firecrawl/page.md
firecrawl scrape "<url>" --html -o .firecrawl/page.html

Don't re-scrape a URL with --html just to extract metadata — that information is already present in the markdown output.

Options: -f <markdown,html,rawHtml,links,screenshot,json>, -H (html shortcut), --only-main-content, --wait-for <ms>, --include-tags, --exclude-tags, -o <path>

Map

Discover all URLs on a site. Run firecrawl map --help for all options.

firecrawl map "<url>" -o .firecrawl/urls.txt
firecrawl map "<url>" --search "keyword" -o .firecrawl/filtered-urls.txt
firecrawl map "<url>" --limit 500 --json -o .firecrawl/urls.json
firecrawl map "<url>" --include-subdomains -o .firecrawl/all-urls.txt

Options: --limit <n>, --search <query>, --sitemap <include|skip|only>, --include-subdomains, --json, -o <path>
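
Map pairs with scrape for step 3 of the workflow. A minimal sketch, assuming the .txt output is one URL per line (docs.example.com is illustrative):

mkdir -p .firecrawl/docs
firecrawl map "https://docs.example.com" --search "auth" -o .firecrawl/auth-urls.txt
# Scrape the top matches in parallel, naming each file after its last path segment
head -5 .firecrawl/auth-urls.txt | xargs -P 5 -I {} sh -c 'firecrawl scrape "{}" -o ".firecrawl/docs/$(basename {}).md"'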

Crawl

Crawl an entire website. Run firecrawl crawl --help for all options.

firecrawl crawl "<url>" --wait -o .firecrawl/crawl-result.json
firecrawl crawl "<url>" --limit 100 --max-depth 3 --wait -o .firecrawl/crawl-result.json
firecrawl crawl "<url>" --include-paths /blog,/docs --wait -o .firecrawl/crawl-result.json
firecrawl crawl "<url>" --exclude-paths /admin,/login --wait -o .firecrawl/crawl-result.json
firecrawl crawl "<url>" --delay 1000 --max-concurrency 2 --wait -o .firecrawl/crawl-result.json
firecrawl crawl <job-id>

Options: --wait, --progress, --limit <n>, --max-depth <n>, --include-paths, --exclude-paths, --delay <ms>, --max-concurrency <n>, --poll-interval <seconds>, --timeout <seconds>, -o <path>, --pretty
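
A finished crawl lands in one JSON file covering every page. Assuming the output mirrors the Firecrawl API shape, with pages under .data[] and a metadata.sourceURL on each (verify against your own file), you can list what was captured:

jq -r '.data[].metadata.sourceURL' .firecrawl/crawl-result.json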

Agent

AI-powered autonomous web data extraction (takes 2-5 minutes). Run firecrawl agent --help for all options.

firecrawl agent "your extraction prompt" --wait -o .firecrawl/agent-result.json
firecrawl agent "your extraction prompt" --urls "<url>" --wait -o .firecrawl/agent-result.json
firecrawl agent "your extraction prompt" --schema '{"type":"object","properties":{"name":{"type":"string"}}}' --wait -o .firecrawl/agent-result.json
firecrawl agent "your extraction prompt" --schema-file ./schema.json --wait -o .firecrawl/agent-result.json
firecrawl agent "your extraction prompt" --model spark-1-pro --wait -o .firecrawl/agent-result.json
firecrawl agent "your extraction prompt" --max-credits 100 --wait -o .firecrawl/agent-result.json
firecrawl agent <job-id>

Options: --urls <urls>, --model <spark-1-mini|spark-1-pro>, --schema <json>, --schema-file <path>, --max-credits <n>, --wait, --poll-interval <seconds>, --timeout <seconds>, -o <path>, --pretty

Browser

Cloud Chromium sessions for interactive browsing. Sessions run remotely in Firecrawl's sandboxed cloud, not on the local machine. Run firecrawl browser --help for all options.

Never use browser on sites with bot detection (Google, Bing, DuckDuckGo, Cloudflare challenges). Use firecrawl search for web searches instead.

Shorthand (Recommended)

Auto-launches a session if needed. Element refs like @e5 come from the snapshot output; run snapshot first, then target the refs with click, fill, or type:

firecrawl browser "open <url>"
firecrawl browser "snapshot"
firecrawl browser "click @e5"
firecrawl browser "fill @e3 'search query'"
firecrawl browser "scrape" -o .firecrawl/browser-scrape.md
firecrawl browser close

Execute mode

Explicit form with execute subcommand:

firecrawl browser execute "open <url>" -o .firecrawl/browser-result.txt
firecrawl browser execute "snapshot" -o .firecrawl/browser-result.txt
firecrawl browser execute "scrape" -o .firecrawl/browser-scrape.md

Session management

firecrawl browser launch-session --ttl 600
firecrawl browser list
firecrawl browser list active --json
firecrawl browser close
firecrawl browser close --session <id>

Core commands: open <url>, snapshot, screenshot, click <@ref>, type <@ref> <text>, fill <@ref> <text>, scrape, scroll <direction>, wait <seconds>, eval <js>

Options: --ttl <seconds>, --ttl-inactivity <seconds>, --stream, --session <id>, -o <path>
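
When scrape alone can't reach paginated content (workflow step 5), a typical session looks like this; the URL is illustrative and the @e12 ref would come from your own snapshot output:

firecrawl browser "open https://example.com/listings"
firecrawl browser "snapshot"
firecrawl browser "scrape" -o .firecrawl/listings-p1.md
# Click the "Next" control identified in the snapshot, then capture page 2
firecrawl browser "click @e12"
firecrawl browser "scrape" -o .firecrawl/listings-p2.md
firecrawl browser close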

Credit Usage

firecrawl credit-usage
firecrawl credit-usage --json --pretty -o .firecrawl/credits.json

Reading Output Files

Always read and process files you already have before fetching more data. Don't re-scrape a URL you already have content for.

Never read entire output files at once — they're often 1000+ lines. Use grep, head, or incremental reads:

wc -l .firecrawl/file.md && head -50 .firecrawl/file.md
grep -n "keyword" .firecrawl/file.md
grep -A 10 "## Section" .firecrawl/file.md
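sed -n '50,120p' .firecrawl/file.md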

Format Behavior

  • Single format: Outputs raw content (markdown text, HTML, etc.)
  • Multiple formats: Outputs JSON with all requested data
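
In practice, --format markdown writes a plain .md file, while --format markdown,links writes a JSON object. Assuming the keys mirror the requested format names (check your file's actual shape first), the links can be extracted with:

jq -r '.links[]' .firecrawl/page.json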

Combining with Other Tools

jq -r '.data.web[].url' .firecrawl/search-query.json
jq -r '.data.web[] | "\(.title): \(.url)"' .firecrawl/search-query.json
jq -r '.data.news[] | "[\(.date)] \(.title)"' .firecrawl/search-news.json
grep -i "keyword" .firecrawl/page.md

Parallelization

Always run independent operations in parallel, never sequentially. Check firecrawl --status for your account's concurrency limit:

firecrawl scrape "<url-1>" -o .firecrawl/1.md &
firecrawl scrape "<url-2>" -o .firecrawl/2.md &
firecrawl scrape "<url-3>" -o .firecrawl/3.md &
wait

For many URLs, use xargs with -P for parallel execution. Hashing each URL gives a collision-free output filename (md5 is the macOS command; on Linux, substitute md5sum | cut -d' ' -f1):

cat urls.txt | xargs -P 10 -I {} sh -c 'firecrawl scrape "{}" -o ".firecrawl/$(echo {} | md5).md"'