web-content-extractor
Web Content Extractor
This skill helps you extract the clean main content (body text/Markdown) from a webpage URL by using Defuddle or Jina AI's reader API.
How it works
To extract the content of a target URL, you will prepend a specific service URL to the target URL and fetch it. This converts the messy webpage into clean Markdown containing only the main content.
Available Services
- Defuddle (Default)
  - Format: https://defuddle.md/<target-url>
  - Example: https://defuddle.md/https://example.com/article
  - Use this as the primary method.
- Jina AI Reader (Fallback)
  - Format: https://r.jina.ai/<target-url>
  - Example: https://r.jina.ai/https://example.com/article
  - Use this if Defuddle fails or returns an error.
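The two URL formats can be captured in one small helper. This is a minimal sketch, not part of the skill itself: the function name build_fetch_url and the service labels defuddle and jina are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: build the fetch URL for one of the two services.
# Usage: build_fetch_url <defuddle|jina> <target-url>
build_fetch_url() {
  service=$1
  target=$2
  case "$service" in
    defuddle) prefix="https://defuddle.md/" ;;   # primary service
    jina)     prefix="https://r.jina.ai/" ;;     # fallback service
    *)
      echo "unknown service: $service" >&2
      return 1
      ;;
  esac
  # The service prefix is simply prepended to the full target URL.
  printf '%s%s\n' "$prefix" "$target"
}

build_fetch_url defuddle "https://example.com/article"
build_fetch_url jina "https://example.com/article"
```

Both calls print the same strings as the Example entries for each service.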
Execution Steps
- Identify the target URL: Extract the full URL the user wants to read from their request. Ensure it includes the protocol (e.g., https://).
- Construct the fetch URL: Prepend https://defuddle.md/ to the target URL.
- Fetch the content: Use the shell_execute tool with curl -sL "FETCH_URL" to download the content.
  - Example command: curl -sL "https://defuddle.md/https://example.com/article"
- Handle Fallbacks: If the curl command fails, returns empty, or returns an error message indicating failure, try the Jina AI service instead: curl -sL "https://r.jina.ai/https://example.com/article"
- Process the output: The output will be in Markdown format.
  - If the user asked you to read it to answer a question, use the content to answer.
  - If the user asked you to extract or save it, present the Markdown to them or save it to a file as requested.
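The fetch and fallback steps could be combined roughly as follows. This is a sketch under stated assumptions: the function name fetch_markdown is hypothetical, and the -f flag is an addition that does not appear in the steps above (it makes curl exit non-zero on HTTP errors of 400 and above, so a failed Defuddle request triggers the fallback). It does not detect an error message returned with a successful status, so that case still needs a manual check.

```shell
# Hypothetical helper: fetch a page as Markdown, trying Defuddle first
# and falling back to Jina AI Reader.
# Usage: fetch_markdown <target-url>
fetch_markdown() {
  target=$1
  for prefix in "https://defuddle.md/" "https://r.jina.ai/"; do
    # -s: silent, -L: follow redirects (as in the steps above).
    # -f: assumption, added here so HTTP errors count as failures.
    if body=$(curl -fsL "${prefix}${target}") && [ -n "$body" ]; then
      printf '%s\n' "$body"
      return 0
    fi
  done
  echo "both services failed for: $target" >&2
  return 1
}

# Example (requires network access):
#   fetch_markdown "https://example.com/article" > article.md
```

An empty response is treated the same as a failed request, which matches the "returns empty" condition in the fallback step.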
Notes
- Always enclose the URL in quotes in the curl command to prevent shell interpretation of special characters like & or ?.
- If the target URL is missing http:// or https://, prepend https:// before appending it to the service URL.
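Both notes can be illustrated together. The helper name normalize_url below is hypothetical, and the sketch assumes the protocol, when present, is written in lowercase.

```shell
# Hypothetical helper: make sure the target URL carries a protocol.
# Usage: normalize_url <target-url>
normalize_url() {
  case "$1" in
    http://*|https://*) printf '%s\n' "$1" ;;          # already has a protocol
    *)                  printf 'https://%s\n' "$1" ;;  # default to https://
  esac
}

target=$(normalize_url "example.com/search?q=a&b=c")
fetch_url="https://defuddle.md/${target}"

# Print the command that would be run. The double quotes around the URL
# keep the shell from interpreting & and ? as control or glob characters.
printf 'curl -sL "%s"\n' "$fetch_url"
```

Without the quotes, the & in the query string would end the command there and run curl in the background with a truncated URL.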