defuddle

Installation
SKILL.md

defuddle: Web Page Cleaner

Defuddle extracts the meaningful content from a web page and drops everything else: ads, cookie banners, nav bars, related articles, footers, social sharing buttons. What remains is the article body as clean markdown.

Use this before any URL ingestion. It is optional but strongly recommended. It cuts token usage by 40-60% on typical web articles and produces cleaner wiki pages.


Install

npm install -g defuddle-cli

Verify: defuddle --version


Usage

Clean a URL directly

defuddle https://example.com/article

Outputs clean markdown to stdout.

Save to .raw/

defuddle https://example.com/article > .raw/articles/article-slug-$(date +%Y-%m-%d).md

Add frontmatter header after saving

After running defuddle, prepend the source URL and fetch date:

SLUG="article-slug-$(date +%Y-%m-%d)"
{ echo "---"; echo "source_url: https://example.com/article"; echo "fetched: $(date +%Y-%m-%d)"; echo "---"; echo ""; defuddle https://example.com/article; } > .raw/articles/$SLUG.md

Clean a local HTML file

defuddle page.html

When to Use

Use defuddle when:

  • Ingesting a news article, blog post, or documentation page from a URL
  • The page has a lot of surrounding content (most web pages do)
  • You want to stay within token budget on a long article

Skip defuddle when:

  • The source is already a clean markdown or PDF file
  • The page is a dashboard, app, or structured data (defuddle expects article-style content)
  • defuddle is not installed and the article is short enough to process raw

Fallback

If defuddle is not installed, check:

which defuddle 2>/dev/null || echo "not installed"

If not installed: use WebFetch directly. The content will be less clean but still workable.


Integration with /wiki-ingest

The /wiki-ingest skill checks for defuddle automatically when a URL is passed. You do not need to run defuddle manually before ingesting a URL. The ingest skill will call it if available.

To manually clean a page and save before ingesting:

  1. Run the save command above
  2. Then: ingest .raw/articles/[slug].md
Weekly Installs
29
GitHub Stars
1.9K
First Seen
11 days ago
Installed on
warp29
gemini-cli29
deepagents29
antigravity29
amp29
cline29