Firecrawl Extension for Claude SEO

Installation

This skill requires the Firecrawl extension to be installed:

./extensions/firecrawl/install.sh

Check availability: Before using any Firecrawl command, verify the MCP server is connected by checking whether firecrawl_scrape (or any other Firecrawl tool) is available. If the tools are not available, inform the user that the extension is not installed and provide the install command above.

Quick Reference

Command                              Purpose
/seo firecrawl crawl <url>           Full-site crawl with content extraction
/seo firecrawl map <url>             Discover site structure (URLs only, fast)
/seo firecrawl scrape <url>          Single-page scrape with JS rendering
/seo firecrawl search <query> <url>  Search within a crawled site

Commands

crawl -- Full-Site Crawl

Crawl an entire website starting from the given URL. Returns page content, metadata, and links for all discovered pages.

MCP Tool: firecrawl_crawl

Parameters:

  • url (required): Starting URL to crawl
  • limit: Max pages to crawl (default: 100, max: 500)
  • maxDepth: Max link depth from start URL (default: 3)
  • includePaths: Array of glob patterns to include (e.g., ["/blog/*"])
  • excludePaths: Array of glob patterns to exclude (e.g., ["/admin/*", "/api/*"])
  • scrapeOptions.formats: Output formats -- ["markdown", "html", "links"]
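
For illustration, a section-focused crawl request might be assembled like this (parameter names are from the list above; the call_tool() helper standing in for the MCP dispatch is hypothetical):

  # Sketch only: call_tool() is a hypothetical MCP dispatcher, not a real API.
  def crawl_blog_section(call_tool, site):
      params = {
          "url": site,                              # starting URL
          "limit": 100,                             # stay at the default page cap
          "maxDepth": 3,
          "includePaths": ["/blog/*"],              # audit only the blog section
          "excludePaths": ["/admin/*", "/api/*"],
          "scrapeOptions": {"formats": ["markdown", "links"]},
      }
      return call_tool("firecrawl_crawl", params)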

SEO Usage Patterns:

  1. Comprehensive audit crawl: Crawl full site, extract all pages for subagent analysis
  2. Section-focused crawl: Use includePaths to audit only /blog/* or /products/*
  3. Broken link detection: Crawl with ["links"] format, check all hrefs for 404s
  4. Content inventory: Extract all page titles, meta descriptions, H1s at scale
  5. SPA/JS-rendered sites: Firecrawl renders JavaScript, solving the Issue #11 problem

Example orchestration for /seo audit:

1. firecrawl_map(url) -> get all URLs (fast, no content)
2. Filter to top 50 most important pages (homepage, key sections)
3. firecrawl_crawl(url, limit=50) -> get full content
4. Feed content to seo-technical, seo-content, seo-schema agents
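
A minimal sketch of that flow, assuming the same hypothetical call_tool() dispatcher and that firecrawl_map returns a plain list of URL strings:

  def audit_site(call_tool, url, budget=50):
      # 1. Fast discovery -- URLs only, no page content
      all_urls = call_tool("firecrawl_map", {"url": url})
      # 2. Crude importance filter: shallower paths (homepage, key sections) first
      top_urls = sorted(all_urls, key=lambda u: u.count("/"))[:budget]
      # 3. Full-content crawl, capped at the same budget
      pages = call_tool("firecrawl_crawl", {"url": url, "limit": budget})
      # 4. pages then feed the seo-technical, seo-content, and seo-schema agents
      return top_urls, pages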

Cost awareness:

  • Free tier: 500 credits/month
  • 1 credit = 1 page crawled or scraped
  • Map operations are cheaper (0.5 credits per URL discovered)
  • Always inform the user of estimated credit usage before large crawls
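
The rates above make pre-flight estimates trivial; a sketch of the arithmetic to surface before any large crawl:

  def estimate_credits(mapped_urls, crawled_pages):
      # 0.5 credits per URL discovered by map, 1 credit per page crawled/scraped
      return 0.5 * mapped_urls + 1 * crawled_pages

  # Example: mapping 342 URLs, then crawling 50 of them
  # = 0.5 * 342 + 50 = 221 credits (~44% of the 500-credit free tier)
  print(estimate_credits(342, 50))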

map -- Site Structure Discovery

Discover all URLs on a website without fetching content. Fast and credit-efficient.

MCP Tool: firecrawl_map

Parameters:

  • url (required): Website URL to map
  • limit: Max URLs to discover (default: 5000)
  • search: Optional search term to filter URLs

SEO Usage Patterns:

  1. Sitemap comparison: Map site, compare discovered URLs vs XML sitemap
  2. Orphan page detection: URLs in sitemap but not linked from any page
  3. Crawl budget analysis: Total indexable pages vs pages linked from homepage
  4. URL pattern analysis: Identify URL structure patterns, duplicates, parameter bloat
  5. Pre-audit discovery: Run map first, then targeted crawl on key sections

Output: Array of URLs. Present as:

Site: example.com
Pages discovered: 342

URL Pattern Breakdown:
  /blog/*          - 128 pages (37%)
  /products/*      - 89 pages (26%)
  /category/*      - 45 pages (13%)
  /pages/*         - 32 pages (9%)
  / (root pages)   - 48 pages (14%)
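
A stdlib-only sketch of producing that breakdown from the returned URL list:

  from collections import Counter
  from urllib.parse import urlparse

  def pattern_breakdown(urls):
      counts = Counter()
      for u in urls:
          segments = [s for s in urlparse(u).path.split("/") if s]
          if len(segments) > 1:
              counts["/" + segments[0] + "/*"] += 1   # /blog/post-1 -> /blog/*
          else:
              counts["/ (root pages)"] += 1           # /, /about, /contact
      total = len(urls) or 1
      for pattern, n in counts.most_common():
          print(f"{pattern:<16} - {n} pages ({n / total:.0%})")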

scrape -- Single-Page Deep Scrape

Scrape a single page with full JavaScript rendering. More thorough than fetch_page.py because it executes JS and waits for dynamic content.

MCP Tool: firecrawl_scrape

Parameters:

  • url (required): Page URL to scrape
  • formats: Output formats -- ["markdown", "html", "links", "screenshot"]
  • onlyMainContent: Strip nav/footer/sidebar (default: true)
  • waitFor: CSS selector or milliseconds to wait for content
  • timeout: Request timeout in ms (default: 30000)
  • actions: Browser actions before scraping (click, scroll, wait)
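
For a JS-heavy page, a scrape request might look like the sketch below. call_tool() is the same hypothetical dispatcher as above, and the action object shapes are an assumption about the tool's schema -- check the extension's documentation before relying on them:

  def scrape_spa(call_tool, url):
      return call_tool("firecrawl_scrape", {
          "url": url,
          "formats": ["markdown", "links"],
          "onlyMainContent": True,          # strip nav/footer/sidebar
          "waitFor": 3000,                  # give the SPA time to render
          "timeout": 30000,
          "actions": [                      # assumed shape, not confirmed API
              {"type": "click", "selector": "#load-more"},
              {"type": "wait", "milliseconds": 1000},
          ],
      })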

SEO Usage Patterns:

  1. SPA content extraction: Scrape JS-rendered React/Vue/Angular pages
  2. Dynamic content audit: Pages with lazy-loaded content below the fold
  3. Paywall/login detection: Identify content behind authentication walls
  4. Main content extraction: Use onlyMainContent for clean E-E-A-T analysis
  5. Screenshot capture: Use screenshot format for visual analysis

When to use scrape vs fetch_page.py:

Scenario               Use
Static HTML page       fetch_page.py (no API cost)
JS-rendered SPA        firecrawl_scrape (renders JS)
Need response headers  fetch_page.py (returns headers)
Need clean markdown    firecrawl_scrape (better extraction)
Rate-limited/blocked   firecrawl_scrape (handles anti-bot)
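
One way to automate this routing is to try a cheap static fetch first and fall back to Firecrawl when the raw HTML looks like an empty JS shell. A sketch, with an arbitrary emptiness threshold:

  import re
  import urllib.request

  def needs_js_rendering(url, min_text_chars=500):
      html = urllib.request.urlopen(url, timeout=15).read().decode("utf-8", "replace")
      # drop scripts/styles and tags, then measure how much visible text remains
      stripped = re.sub(r"(?is)<(script|style)\b.*?</\1>", "", html)
      text = re.sub(r"<[^>]+>", " ", stripped)
      # True -> route to firecrawl_scrape; False -> fetch_page.py is enough
      return len(" ".join(text.split())) < min_text_chars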

search -- Site-Scoped Search

Search within a website for specific content. Useful for finding pages related to a topic without crawling everything.

MCP Tool: firecrawl_search

Parameters:

  • query (required): Search query
  • url (required): Website to search within
  • limit: Max results (default: 10)
  • scrapeOptions.formats: Output format for matched pages

SEO Usage Patterns:

  1. Content gap validation: Search for a keyword on the site to check if content exists
  2. Internal linking opportunities: Find pages mentioning a topic that could link to each other
  3. Duplicate content detection: Search for key phrases to find near-duplicates
  4. Competitor content research: Search competitor site for specific topics
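
For pattern 2, a sketch of surfacing internal-link candidates. The result shape (a list of dicts with url and markdown keys) is an assumption about what the tool returns:

  def link_candidates(call_tool, site, topic, target_url):
      results = call_tool("firecrawl_search",
                          {"query": topic, "url": site, "limit": 10})
      # pages that mention the topic but never link to the target page
      return [r["url"] for r in results
              if topic.lower() in r.get("markdown", "").lower()
              and target_url not in r.get("markdown", "")]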

Cross-Skill Integration

With seo-audit (full audit)

When Firecrawl is available during /seo audit:

  1. Use firecrawl_map to discover all site URLs
  2. Compare with XML sitemap (seo-sitemap) to find orphan/missing pages
  3. Select top pages for deep analysis
  4. Feed crawled content to all subagents (technical, content, schema, geo)
  5. Report total crawlable pages, URL patterns, and crawl depth

With seo-technical

  • Broken link detection: crawl all internal links, check for 404s
  • Redirect chain mapping: follow all redirects, flag chains > 2 hops
  • Mixed content detection: check HTTP resources on HTTPS pages
  • Canonical verification: compare canonical URLs with actual URLs
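
A stdlib sketch of the broken-link pass, run over the deduplicated hrefs collected from a crawl using the ["links"] format:

  import urllib.error
  import urllib.request

  def find_broken_links(hrefs):
      broken = []
      for href in sorted(set(hrefs)):            # dedupe before hitting the network
          req = urllib.request.Request(href, method="HEAD")
          try:
              urllib.request.urlopen(req, timeout=10)
          except urllib.error.HTTPError as e:
              if e.code in (404, 410):           # some servers reject HEAD (405);
                  broken.append((href, e.code))  # a GET fallback may be needed
          except urllib.error.URLError:
              broken.append((href, "unreachable"))
      return broken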

With seo-sitemap

  • Sitemap coverage: % of crawled pages present in sitemap
  • Orphan pages: URLs in the sitemap that the link crawl never reaches
  • Unlisted pages: pages found by the crawl but missing from the sitemap
  • Stale sitemap entries: URLs in sitemap that return 404/410
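
These checks reduce to set operations once both URL lists are in hand. A sketch, assuming crawled_urls comes from firecrawl_map and sitemap_urls from the parsed XML sitemap:

  def sitemap_report(crawled_urls, sitemap_urls):
      crawled, listed = set(crawled_urls), set(sitemap_urls)
      coverage = len(crawled & listed) / (len(crawled) or 1)  # sitemap coverage
      orphans = listed - crawled    # in the sitemap, never reached by the crawl
      unlisted = crawled - listed   # crawlable but absent from the sitemap
      return coverage, orphans, unlisted

Stale entries (the 404/410 check) can reuse the broken-link pass sketched under seo-technical above.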

With seo-content

  • Content extraction: feed clean markdown to E-E-A-T analysis
  • Thin content detection: identify pages with < 300 words at scale
  • Duplicate content: compare content across pages for near-duplicates
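
A sketch of the thin-content pass, assuming each crawled page is a dict with url and markdown keys (the shape is an assumption):

  def thin_pages(pages, min_words=300):
      report = []
      for p in pages:
          words = len(p.get("markdown", "").split())
          if words < min_words:          # whitespace word count, a rough proxy
              report.append((p["url"], words))
      return report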

With seo-schema

  • Schema extraction: pull JSON-LD from all crawled pages
  • Schema coverage: % of pages with structured data
  • Schema validation: batch-validate extracted schemas
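
A regex-based sketch of the extraction step, run over raw HTML from a crawl using the ["html"] format (regex scraping is brittle; a real HTML parser would be more robust):

  import json
  import re

  def extract_json_ld(html):
      pattern = r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>'
      blocks = []
      for raw in re.findall(pattern, html, re.DOTALL | re.IGNORECASE):
          try:
              blocks.append(json.loads(raw))
          except json.JSONDecodeError:
              pass                       # malformed schema is itself a finding
      return blocks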

Error Handling

Error                      Cause                    Resolution
FIRECRAWL_API_KEY not set  MCP not configured       Run ./extensions/firecrawl/install.sh
402 Payment Required       Credits exhausted        Check usage at firecrawl.dev/app, upgrade plan
429 Too Many Requests      Rate limited             Wait 60s, reduce crawl concurrency
408 Timeout                Page too slow to render  Increase timeout, try without JS rendering
403 Forbidden              Site blocks crawling     Check robots.txt, may need to skip this site
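
For the two transient rows (429 and 408), simple exponential backoff is usually enough. A sketch; the exception class is a stand-in, since the real error surface depends on the MCP client:

  import time

  class TransientToolError(Exception):
      pass                               # stand-in for a 429/408 response

  def with_backoff(call, retries=3, base_delay=60.0):
      for attempt in range(retries):
          try:
              return call()              # e.g. a lambda wrapping an MCP tool call
          except TransientToolError:
              if attempt == retries - 1:
                  raise                  # retries exhausted -- surface the error
              time.sleep(base_delay * 2 ** attempt)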

Graceful fallback: If Firecrawl is unavailable, inform the user and suggest:

  1. Use fetch_page.py for single-page analysis (no API cost)
  2. Use WebFetch tool for basic HTML retrieval
  3. Install Firecrawl: ./extensions/firecrawl/install.sh