Web Clipper
Save any web page as clean markdown with YAML frontmatter. Optionally search your clips or ingest them into repo-search for semantic search.
Prerequisites
- Python 3 with venv
Setup
First-Time Setup
~/.claude/skills/web-clipper/setup.sh
This creates a .venv and installs dependencies (trafilatura, requests, python-slugify, pyyaml).
Usage
Clip a URL
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/clip.py <url>
With tags:
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/clip.py <url> --tags "python,web-dev"
JSON output:
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/clip.py <url> -f json
Clips are saved to ~/web-clips/ as markdown files with YAML frontmatter (title, url, domain, author, date, tags).
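To make the file layout concrete, here is a minimal sketch of a clip with the six frontmatter fields listed above. It is not the code in clip.py: the function names `render_clip` and `split_frontmatter` are hypothetical, and the sketch writes values with `json.dumps` (JSON scalars and lists are also valid YAML flow syntax) so that it needs only the standard library. The real script most likely relies on pyyaml, and its exact field formatting may differ.

```python
import json

FIELDS = ("title", "url", "domain", "author", "date", "tags")


def render_clip(meta, body):
    """Render metadata and article text as markdown with YAML frontmatter."""
    lines = ["---"]
    for key in FIELDS:
        # json.dumps quotes strings and renders lists in YAML-compatible flow style
        lines.append(f"{key}: {json.dumps(meta.get(key), ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.strip() + "\n"


def split_frontmatter(text):
    """Split a clip into (metadata, body).

    Only understands the one-line-per-key output of render_clip (an assumption);
    arbitrary YAML needs a real parser such as pyyaml.
    """
    if not text.startswith("---\n"):
        return {}, text
    header, _, body = text[4:].partition("\n---\n")
    meta = {}
    for line in header.splitlines():
        key, _, raw = line.partition(": ")
        meta[key] = json.loads(raw)
    return meta, body.lstrip("\n")
```

A round trip through `render_clip` and `split_frontmatter` should return the original metadata, which is a quick way to check that a clip file is well formed.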
List Clips
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/list.py
Filter by domain, tag, or date:
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/list.py --domain "example.com"
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/list.py --tag "python"
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/list.py --after 2026-01-01 --before 2026-02-01
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/list.py -f json
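The filters above can be pictured as predicates over each clip's frontmatter. The sketch below is an illustration, not the logic in list.py: `filter_clips` is a hypothetical name, clips are assumed to be dicts carrying the `domain`, `tags`, and `date` fields, dates are assumed to start with an ISO `YYYY-MM-DD` prefix, and the date bounds are treated as inclusive, which the real script may or may not do.

```python
from datetime import date


def filter_clips(clips, domain=None, tag=None, after=None, before=None):
    """Return the clips that match every filter that was supplied."""
    result = []
    for clip in clips:
        # Assumption: the date field begins with YYYY-MM-DD
        clipped = date.fromisoformat(clip["date"][:10])
        if domain and clip["domain"] != domain:
            continue
        if tag and tag not in clip["tags"]:
            continue
        # Assumption: both bounds are inclusive
        if after and clipped < date.fromisoformat(after):
            continue
        if before and clipped > date.fromisoformat(before):
            continue
        result.append(clip)
    return result
```

Filters combine with AND in this sketch, so passing both a domain and a tag narrows the result rather than widening it.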
Search Clips
Full-text search across all clips:
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/search.py "search terms"
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/search.py "search terms" -f json
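As a rough picture of what a full-text search over the clips directory involves, the sketch below scans every markdown file and reports lines that contain all query terms, case-insensitively. It is an assumption about the behaviour, not the implementation of search.py: `search_clips` is a hypothetical name, and the real script may rank results, match per file rather than per line, or return different fields.

```python
from pathlib import Path


def search_clips(clips_dir, query):
    """Return (filename, line_number, line) for lines containing every query term."""
    terms = query.lower().split()
    hits = []
    if not terms:
        return hits
    for path in sorted(Path(clips_dir).expanduser().glob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            if all(term in lowered for term in terms):
                hits.append((path.name, number, line.strip()))
    return hits
```

Calling `search_clips("~/web-clips", "search terms")` would scan the default clips directory named earlier; the frontmatter is searched along with the body, so a query can also hit a title or tag.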
Delete a Clip
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/delete.py <filename>
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/delete.py --url "https://example.com/article"
Ingest into Repo Search (semantic search)
Requires the repo-search skill. Pushes all clips into ChromaDB:
~/.claude/skills/web-clipper/.venv/bin/python ~/.claude/skills/web-clipper/scripts/ingest.py
After ingestion, clips are searchable via repo-search:
~/.claude/skills/repo-search/.venv/bin/python ~/.claude/skills/repo-search/query.py "query" --collection web-clips
Error Handling
| Error | Cause | Fix |
|---|---|---|
| Could not extract article content | Page has no extractable article text (e.g., SPA, login wall) | JS-rendered pages may not be extractable |
| repo-search skill not found | repo-search not installed | Run ./install.sh repo-search |
Limitations
- Extracts article text only — does not preserve images, videos, or interactive elements
- JavaScript-rendered SPAs may not be fully extractable
- Login-walled content cannot be accessed