extract
SKILL.md
Extract Skill
Extract clean content from specific URLs. Ideal when you know which pages you want content from.
Prerequisites
Tavily API Key Required - Get your key at https://tavily.com
Add to ~/.claude/settings.json:
{
"env": {
"TAVILY_API_KEY": "tvly-your-api-key-here"
}
}
Quick Start
Using the Script
./scripts/extract.sh '<json>'
Examples:
# Single URL
./scripts/extract.sh '{"urls": ["https://example.com/article"]}'
# Multiple URLs
./scripts/extract.sh '{"urls": ["https://example.com/page1", "https://example.com/page2"]}'
# With query focus and chunks
./scripts/extract.sh '{"urls": ["https://example.com/docs"], "query": "authentication API", "chunks_per_source": 3}'
# Advanced extraction for JS pages
./scripts/extract.sh '{"urls": ["https://app.example.com"], "extract_depth": "advanced", "timeout": 60}'
Basic Extraction
curl --request POST \
--url https://api.tavily.com/extract \
--header "Authorization: Bearer $TAVILY_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"urls": ["https://example.com/article"]
}'
Multiple URLs with Query Focus
curl --request POST \
--url https://api.tavily.com/extract \
--header "Authorization: Bearer $TAVILY_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"urls": [
"https://example.com/ml-healthcare",
"https://example.com/ai-diagnostics"
],
"query": "AI diagnostic tools accuracy",
"chunks_per_source": 3
}'
API Reference
Endpoint
POST https://api.tavily.com/extract
Headers
| Header | Value |
|---|---|
Authorization |
Bearer <TAVILY_API_KEY> |
Content-Type |
application/json |
Request Body
| Field | Type | Default | Description |
|---|---|---|---|
urls |
array | Required | URLs to extract (max 20) |
query |
string | null | Reranks chunks by relevance |
chunks_per_source |
integer | 3 | Chunks per URL (1-5, requires query) |
extract_depth |
string | "basic" |
basic or advanced (for JS pages) |
format |
string | "markdown" |
markdown or text |
include_images |
boolean | false | Include image URLs |
timeout |
float | varies | Max wait (1-60 seconds) |
Response Format
{
"results": [
{
"url": "https://example.com/article",
"raw_content": "# Article Title\n\nContent..."
}
],
"failed_results": [],
"response_time": 2.3
}
Extract Depth
| Depth | When to Use |
|---|---|
basic |
Simple text extraction, faster |
advanced |
Dynamic/JS-rendered pages, tables, structured data |
Examples
Single URL Extraction
curl --request POST \
--url https://api.tavily.com/extract \
--header "Authorization: Bearer $TAVILY_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"urls": ["https://docs.python.org/3/tutorial/classes.html"],
"extract_depth": "basic"
}'
Targeted Extraction with Query
curl --request POST \
--url https://api.tavily.com/extract \
--header "Authorization: Bearer $TAVILY_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"urls": [
"https://example.com/react-hooks",
"https://example.com/react-state"
],
"query": "useState and useEffect patterns",
"chunks_per_source": 2
}'
JavaScript-Heavy Pages
curl --request POST \
--url https://api.tavily.com/extract \
--header "Authorization: Bearer $TAVILY_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"urls": ["https://app.example.com/dashboard"],
"extract_depth": "advanced",
"timeout": 60
}'
Batch Extraction
curl --request POST \
--url https://api.tavily.com/extract \
--header "Authorization: Bearer $TAVILY_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"urls": [
"https://example.com/page1",
"https://example.com/page2",
"https://example.com/page3",
"https://example.com/page4",
"https://example.com/page5"
],
"extract_depth": "basic"
}'
Tips
- Max 20 URLs per request - batch larger lists
- Use
query+chunks_per_sourceto get only relevant content - Try
basicfirst, fall back toadvancedif content is missing - Set longer
timeoutfor slow pages (up to 60s) - Check
failed_resultsfor URLs that couldn't be extracted
Weekly Installs
30
Repository
tavily-ai/tavily-pluginsFirst Seen
Jan 26, 2026
Security Audits
Installed on
claude-code30
gemini-cli22
antigravity20
cursor18
opencode18
windsurf15