brightdata
SKILL.md
BrightData - Progressive URL Scraping
Intelligent URL content retrieval with automatic fallback strategy.
Activation Triggers
Direct Scraping Requests
- "scrape this URL", "scrape [URL]", "scrape this page"
- "fetch this URL", "fetch [URL]", "fetch this page"
- "pull content from [URL]", "get content from [URL]"
- "retrieve [URL]", "download this page content"
Access Issues
- "can't access this site", "site is blocking me"
- "bot detection", "CAPTCHA", "403 error"
- "this URL won't load"
Explicit Tier Requests
- "use Bright Data to fetch [URL]" - Skip to Tier 4
- "use browser to scrape [URL]" - Skip to Tier 3
Core Capability
Four-Tier Progressive Escalation:
START
|
Tier 1 (WebFetch) --> Success? --> Return content
|
No
|
Tier 2 (Curl) --> Success? --> Return content
|
No
|
Tier 3 (Browser) --> Success? --> Return content
|
No
|
Tier 4 (Bright Data) --> Success? --> Return content
|
No
|
Report failure + alternatives
Tier Details
Tier 1: WebFetch (Built-in)
- Tool: Claude Code's WebFetch
- Speed: ~2-5 seconds
- Cost: Free
- Works for: Public sites, no bot detection
Tier 2: Curl with Chrome Headers
- Tool: Bash curl with comprehensive browser headers
- Speed: ~3-7 seconds
- Cost: Free
- Works for: Sites with basic user-agent filtering
Tier 3: Browser Automation (Playwright)
- Tool: Browser skill's Playwright automation
- Speed: ~10-20 seconds
- Cost: Free
- Works for: JavaScript SPAs, dynamic content
Tier 4: Bright Data MCP
- Tool:
mcp__Brightdata__scrape_as_markdown - Speed: ~5-15 seconds
- Cost: Bright Data credits
- Works for: CAPTCHA, advanced bot detection, residential IPs
Workflow Routing
Default workflow for all URL scraping:
- Route to:
Workflows/FourTierScrape.md - Output: URL content in markdown format
Examples
Example 1: Simple Site (Tier 1 Success)
User: "Scrape https://example.com"
Process:
- Attempt Tier 1 (WebFetch)
- Success in 3 seconds
- Return markdown content
Example 2: JavaScript Site (Tier 3 Success)
User: "Fetch https://spa-app.com"
Process:
- Tier 1 fails (blocked)
- Tier 2 fails (JavaScript required)
- Tier 3 succeeds with Playwright
- Return markdown content
Example 3: Protected Site (Tier 4 Success)
User: "Can't access https://protected-site.com"
Process:
- Tier 1 fails (403)
- Tier 2 fails (bot detection)
- Tier 3 fails (CAPTCHA)
- Tier 4 succeeds with Bright Data
- Return markdown content
Example 4: Direct Tier Request
User: "Use Bright Data to fetch https://any-site.com"
Process:
- User explicitly requested Bright Data
- Skip directly to Tier 4
- Return markdown content
Integration Points
- WebFetch Tool - Built-in Claude Code tool
- Bash Tool - For curl commands
- Browser Skill - For Playwright automation (requires pai-browser-skill)
- Bright Data MCP - For professional scraping
Related Documentation
Workflows/FourTierScrape.md- Complete workflow implementationINSTALL.md- Installation guideVERIFY.md- Verification checklist
Last Updated: 2026-01-14
Weekly Installs
4
First Seen
2 days ago
Installed on
claude-code3
clawdbot2
opencode2
cursor2
codex2
antigravity2