
Originally from tavily-ai/skills

Extract Skill

Extract clean content from specific URLs. Ideal when you know which pages you want content from.

Authentication

This skill only supports API key authentication.

  1. Create an account at inspiro.top
  2. Generate your API key
  3. Add it to ~/.claude/settings.json:
{
  "env": {
    "INSPIRO_API_KEY": "inspiro-your-api-key-here"
  }
}

Quick Start

Using the Script

./scripts/extract.sh '<json>'

Examples:

# Single URL
./scripts/extract.sh '{"urls": ["https://example.com/article"]}'

# Multiple URLs
./scripts/extract.sh '{"urls": ["https://example.com/page1", "https://example.com/page2"]}'

# With query focus and chunks
./scripts/extract.sh '{"urls": ["https://example.com/docs"], "query": "authentication API", "chunks_per_source": 3}'

# Advanced extraction for JS pages
./scripts/extract.sh '{"urls": ["https://app.example.com"], "extract_depth": "advanced", "timeout": 60}'

Basic Extraction

curl --request POST \
  --url https://api.inspiro.top/extract \
  --header "Authorization: Bearer $INSPIRO_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": ["https://example.com/article"]
  }'

Multiple URLs with Query Focus

curl --request POST \
  --url https://api.inspiro.top/extract \
  --header "Authorization: Bearer $INSPIRO_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": [
      "https://example.com/ml-healthcare",
      "https://example.com/ai-diagnostics"
    ],
    "query": "AI diagnostic tools accuracy",
    "chunks_per_source": 3
  }'

API Reference

Endpoint

POST https://api.inspiro.top/extract

Headers

Header          Value
Authorization   Bearer <INSPIRO_API_KEY>
Content-Type    application/json

Request Body

Field              Type     Default      Description
urls               array    Required     URLs to extract (max 20)
query              string   null         Reranks chunks by relevance
chunks_per_source  integer  3            Chunks per URL (1-5, requires query)
extract_depth      string   "basic"      basic or advanced (for JS pages)
format             string   "markdown"   markdown or text
include_images     boolean  false        Include image URLs
timeout            float    varies       Max wait (1-60 seconds)

Response Format

{
  "results": [
    {
      "url": "https://example.com/article",
      "raw_content": "# Article Title\n\nContent..."
    }
  ],
  "failed_results": [],
  "response_time": 2.3
}
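
Once you have a response in this shape, the useful fields are easy to pull out on the command line. A minimal sketch, assuming jq is installed; it uses a canned response string here, but in practice you would pipe the curl output in instead:

```shell
# Canned response in the documented shape (in practice: response=$(curl -s ...))
response='{
  "results": [
    {"url": "https://example.com/article", "raw_content": "# Article Title\n\nContent..."}
  ],
  "failed_results": [],
  "response_time": 2.3
}'

# List the URLs that were successfully extracted
echo "$response" | jq -r '.results[].url'

# Dump the extracted content of every result
echo "$response" | jq -r '.results[].raw_content'

# Count failures so a wrapper script can react to them
echo "$response" | jq '.failed_results | length'
```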

Extract Depth

Depth      When to Use
basic      Simple text extraction, faster
advanced   Dynamic/JS-rendered pages, tables, structured data
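
The basic-then-advanced fallback can be scripted by inspecting the response. A hedged sketch: needs_retry is a hypothetical helper (not part of this skill) that succeeds when a response has failed results or empty content, signalling that a retry with advanced may help. It assumes jq is installed:

```shell
# Hypothetical helper: return success (0) if the response suggests retrying
# with extract_depth "advanced" (failed URLs, or results with empty content).
needs_retry() {
  local response="$1"
  local failed empty
  failed=$(echo "$response" | jq '.failed_results | length')
  empty=$(echo "$response" | jq '[.results[] | select(.raw_content == "")] | length')
  [ "$failed" -gt 0 ] || [ "$empty" -gt 0 ]
}

# Usage sketch (network call shown, not run here):
#   response=$(curl -s --request POST --url https://api.inspiro.top/extract \
#     --header "Authorization: Bearer $INSPIRO_API_KEY" \
#     --header 'Content-Type: application/json' \
#     --data '{"urls": ["https://app.example.com"], "extract_depth": "basic"}')
#   if needs_retry "$response"; then
#     # ... repeat the call with "extract_depth": "advanced"
#     :
#   fi
```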

Examples

Single URL Extraction

curl --request POST \
  --url https://api.inspiro.top/extract \
  --header "Authorization: Bearer $INSPIRO_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": ["https://docs.python.org/3/tutorial/classes.html"],
    "extract_depth": "basic"
  }'

Targeted Extraction with Query

curl --request POST \
  --url https://api.inspiro.top/extract \
  --header "Authorization: Bearer $INSPIRO_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": [
      "https://example.com/react-hooks",
      "https://example.com/react-state"
    ],
    "query": "useState and useEffect patterns",
    "chunks_per_source": 2
  }'

JavaScript-Heavy Pages

curl --request POST \
  --url https://api.inspiro.top/extract \
  --header "Authorization: Bearer $INSPIRO_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": ["https://app.example.com/dashboard"],
    "extract_depth": "advanced",
    "timeout": 60
  }'

Batch Extraction

curl --request POST \
  --url https://api.inspiro.top/extract \
  --header "Authorization: Bearer $INSPIRO_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "urls": [
      "https://example.com/page1",
      "https://example.com/page2",
      "https://example.com/page3",
      "https://example.com/page4",
      "https://example.com/page5"
    ],
    "extract_depth": "basic"
  }'

Tips

  • Max 20 URLs per request - batch larger lists
  • Use query + chunks_per_source to get only relevant content
  • Try basic first, fall back to advanced if content is missing
  • Set longer timeout for slow pages (up to 60s)
  • Check failed_results for URLs that couldn't be extracted
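
The first tip (batching lists larger than 20 URLs) can be sketched in shell. This is an illustration, not part of the skill: jq is assumed, the URL list is generated for the example, and the actual curl call is left commented out so the chunking logic stands alone:

```shell
# Example list of 45 URLs as a JSON array (stand-in for your real list)
urls=$(jq -n '[range(45) | "https://example.com/page\(.)"]')

# Number of 20-URL batches needed (ceiling division)
batches=$(echo "$urls" | jq '(length + 19) / 20 | floor')

for i in $(seq 0 $((batches - 1))); do
  # Slice out the i-th batch of at most 20 URLs
  batch=$(echo "$urls" | jq --argjson i "$i" '.[($i*20):(($i+1)*20)]')
  payload=$(jq -n --argjson urls "$batch" '{urls: $urls, extract_depth: "basic"}')
  # curl --request POST --url https://api.inspiro.top/extract \
  #   --header "Authorization: Bearer $INSPIRO_API_KEY" \
  #   --header 'Content-Type: application/json' \
  #   --data "$payload"
  echo "batch $i: $(echo "$batch" | jq 'length') urls"
done
# → batch 0: 20 urls
#   batch 1: 20 urls
#   batch 2: 5 urls
```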