Web Browse
Fetch web pages and extract readable text content.
Quick fetch (raw HTML)
curl -sL "https://example.com" | head -n 200
Extract text with Python
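curl -sL "https://example.com" | python3 -c "
import sys, html, re
raw = sys.stdin.read()
# Drop script and style blocks, including their contents (case-insensitive)
text = re.sub(r'<script[^>]*>.*?</script>', '', raw, flags=re.DOTALL | re.IGNORECASE)
text = re.sub(r'<style[^>]*>.*?</style>', '', text, flags=re.DOTALL | re.IGNORECASE)
# Replace the remaining tags with spaces, then decode entities such as &amp;
text = re.sub(r'<[^>]+>', ' ', text)
text = html.unescape(text)
# Collapse whitespace and cap the output at 8000 characters
text = re.sub(r'\s+', ' ', text).strip()
print(text[:8000])
"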
Get page title and meta
curl -sL "https://example.com" | python3 -c "
import sys, re
h = sys.stdin.read()
# [^>]* lets the title tag carry attributes
title = re.search(r'<title[^>]*>(.*?)</title>', h, re.I|re.S)
# \x27 is a single quote; a literal one would end the Python string early
desc = re.search(r'<meta[^>]*name=[\"\x27]description[\"\x27][^>]*content=[\"\x27](.*?)[\"\x27]', h, re.I)
print(f'Title: {title.group(1).strip() if title else \"N/A\"}')
print(f'Description: {desc.group(1).strip() if desc else \"N/A\"}')
"
Download a file
curl -sL -o /tmp/file.pdf "https://example.com/report.pdf"
Notes
- Respect robots.txt. Do not scrape excessively.
- Use -L to follow redirects.
- For JavaScript-heavy sites, consider the browser skill.
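One way to act on the robots.txt note is to check a URL against the site's rules before fetching it. The sketch below uses Python's standard `urllib.robotparser` and assumes the robots.txt body has already been downloaded as text, for instance with `curl -sL "https://example.com/robots.txt"`. The helper name `is_allowed` is hypothetical.

```python
from urllib import robotparser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if robots_txt permits user_agent to fetch url."""
    rp = robotparser.RobotFileParser()
    # parse() takes the file content as an iterable of lines
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)


# Example rules: everything under /private/ is off limits to all agents
rules = "User-agent: *\nDisallow: /private/\n"
print(is_allowed(rules, "https://example.com/index.html"))
print(is_allowed(rules, "https://example.com/private/data.html"))
```

This covers only the allow and disallow rules. Pacing requests so that a site is not scraped excessively remains a separate concern.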
Repository: thinkfleetai/th…t-engine
First Seen: Mar 1, 2026