web-content-downloader
Web Content Downloader
This skill downloads the content of a specified web page and converts it to Markdown, retaining the original language (no translation). It also extracts the page's images, downloads them into a local `img` directory, gives them meaningful names, and updates their references in the Markdown file accordingly.
Workflow
Strictly follow these steps:
1. Fetch and Convert Web Content
- Primary Method: Use Jina Reader to fetch the content. Execute the terminal command `curl -s "https://r.jina.ai/<TARGET_URL>"`. This directly returns the main body in well-formatted Markdown.
- Fallback Method: If Jina Reader fails or returns empty content, fall back to other tools (such as `mcp_DuckDuckGo_Search_Server_fetch_content`, `curl`, or equivalent tools) to fetch the HTML content of the target URL, then clean and convert the HTML into Markdown.
- Keep Original Language: Do NOT translate the content. Retain the original language of the web page.
- Table Handling: If the original web page contains tables (HTML `<table>`), they MUST be accurately converted into standard Markdown table format (`|---|---|`). Avoid using HTML line break tags like `<br>` inside Markdown tables to keep them clean.
- Formatting Rules: Ensure the typography follows standard rules: insert spaces between Chinese and English characters; ensure code blocks have explanatory comments.
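The primary/fallback order above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: `http_get` and `fetch_page` are hypothetical names, and "failure" is assumed to mean a network/HTTP error or a whitespace-only response. Only the `https://r.jina.ai/` prefix comes from the command above.

```python
import urllib.request

JINA_PREFIX = "https://r.jina.ai/"  # prefix taken from the curl command above


def http_get(url: str, timeout: float = 30.0) -> str:
    """Fetch a URL and return its body as text (hypothetical helper)."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_page(target_url: str, get=http_get) -> tuple[str, str]:
    """Return (kind, text): kind is "markdown" when Jina Reader succeeded,
    otherwise "html" for the raw page that still needs conversion."""
    try:
        markdown = get(JINA_PREFIX + target_url)
        if markdown.strip():
            return "markdown", markdown
    except OSError:
        # URLError and HTTPError are OSError subclasses: fall through.
        pass
    # Fallback: fetch the raw HTML; cleaning and conversion happen elsewhere.
    return "html", get(target_url)
```

The `get` parameter exists only so the fallback logic can be exercised without network access; a caller would normally use the default.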
2. Extract Image Links
- Parse the original web content or HTML to extract all core image links. Pay attention not only to `<img src="...">` and Markdown image syntax, but also to modern responsive tags like `<source srcset="...">` or `srcset` attributes.
- Filter out meaningless images such as favicons, tracking pixels, or tiny UI icons.
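One way to approach the extraction for the HTML case is sketched below using only the standard library. `ImageCollector`, `extract_image_urls`, and the `SKIP_HINTS` substrings are illustrative assumptions rather than part of the skill; the filter is a rough URL-based heuristic and cannot detect tiny icons by size. The sketch picks the last `srcset` candidate, which is often but not always the largest, and it does not handle URLs that themselves contain commas.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical heuristic: URL substrings that often mark non-content images.
SKIP_HINTS = ("favicon", "tracking", "spacer.gif")


class ImageCollector(HTMLParser):
    """Collect image URLs from <img src>, <img srcset>, and <source srcset>."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "source"):
            return
        attributes = dict(attrs)
        candidates = []
        if tag == "img" and attributes.get("src"):
            candidates.append(attributes["src"])
        if attributes.get("srcset"):
            # srcset is a comma-separated list of "url descriptor" entries.
            entries = [e.strip() for e in attributes["srcset"].split(",") if e.strip()]
            if entries:
                candidates.append(entries[-1].split()[0])
        for candidate in candidates:
            url = urljoin(self.base_url, candidate)  # resolve relative paths
            if url not in self.urls:
                self.urls.append(url)


def extract_image_urls(html: str, base_url: str) -> list[str]:
    """Return de-duplicated, absolute image URLs with obvious noise removed."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    return [
        url for url in collector.urls
        if not url.startswith("data:")
        and not any(hint in url.lower() for hint in SKIP_HINTS)
    ]
```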
3. Create Directory and Download Images
- Check if an `img` directory exists in the current workspace. If not, create it using the terminal command `mkdir -p img`.
- Use terminal commands (e.g., `curl -O`) to batch download the extracted images into the `img` directory.
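For environments where a script is preferred over individual terminal commands, the same two steps might look like the sketch below. The function names are hypothetical. Files are named after the last URL path segment, mirroring `curl -O`, so two URLs with the same file name would overwrite each other; a failed download is skipped rather than aborting the batch.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_path_for(url: str, directory: str = "img") -> str:
    """Name the file after the last URL path segment, as `curl -O` does."""
    name = os.path.basename(urlparse(url).path) or "image"
    return os.path.join(directory, name)


def download_images(urls, directory="img", fetch=urllib.request.urlretrieve):
    """Download each URL into `directory`; return {url: local_path}."""
    os.makedirs(directory, exist_ok=True)  # same effect as `mkdir -p img`
    saved = {}
    for url in urls:
        path = local_path_for(url, directory)
        try:
            fetch(url, path)
        except OSError as error:
            # Skip images that cannot be fetched instead of stopping the batch.
            print(f"skipped {url}: {error}")
            continue
        saved[url] = path
    return saved
```

The returned mapping from remote URL to local path is what a later step needs in order to rewrite the Markdown references.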
4. Rename Images
- Generate meaningful English or Pinyin names for the downloaded images based on the article's context or what the image depicts (e.g., `design-patterns.png`).
- Use terminal commands (e.g., `mv`) to rename the downloaded images.
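Choosing the description is a judgment call based on the article, but the mechanical part of renaming can be sketched as below. `slugify` and `rename_image` are hypothetical helpers; the sketch assumes the description is already English or Pinyin in ASCII (other characters are dropped) and appends a numeric suffix if the target name is taken.

```python
import os
import re


def slugify(description: str) -> str:
    """Turn a short description into a lowercase, hyphen-separated stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return slug or "image"


def rename_image(path: str, description: str) -> str:
    """Rename `path` after `description`, keeping its extension."""
    directory, old_name = os.path.split(path)
    extension = os.path.splitext(old_name)[1].lower()
    stem = slugify(description)
    candidate = os.path.join(directory, stem + extension)
    counter = 2
    # Avoid clobbering an existing file with the same meaningful name.
    while os.path.exists(candidate) and candidate != path:
        candidate = os.path.join(directory, f"{stem}-{counter}{extension}")
        counter += 1
    os.rename(path, candidate)  # equivalent of `mv`
    return candidate
```

For instance, `rename_image("img/IMG_0001.PNG", "Design Patterns")` would produce `img/design-patterns.png` when no file of that name exists yet.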
5. Update Markdown References
- Replace the original remote image links in the Markdown document with the local image links (e.g., a link pointing to `img/design-patterns.png`).
- Save the final Markdown text to the file specified by the user, or generate a `.md` file named after the web page title and save it there.
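The reference update amounts to a targeted search-and-replace over Markdown image syntax. The sketch below is one possible approach, assuming a mapping from each remote URL to its final local path is available; `rewrite_image_links` and `filename_from_title` are hypothetical names, and the pattern does not handle image URLs that contain parentheses or whitespace.

```python
import re

# Matches the "![alt](" prefix and the URL that follows it.
IMAGE_LINK = re.compile(r"(!\[[^\]]*\]\()([^)\s]+)")


def rewrite_image_links(markdown: str, url_to_local: dict[str, str]) -> str:
    """Replace remote image targets with local paths; leave others as-is."""

    def swap(match):
        url = match.group(2)
        return match.group(1) + url_to_local.get(url, url)

    return IMAGE_LINK.sub(swap, markdown)


def filename_from_title(title: str) -> str:
    """Derive a .md file name from a page title (illustrative rule only)."""
    # Collapse whitespace and characters that are unsafe in file names.
    stem = re.sub(r'[\\/:*?"<>|\s]+', "-", title.strip()).strip("-")
    return (stem or "untitled") + ".md"
```

Ordinary links (without the leading `!`) are left untouched, so a hyperlink that points to the same remote image keeps its original target. The result can then be written with `pathlib.Path(name).write_text(text, encoding="utf-8")`.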