web-scraper
Warn
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: MEDIUMEXTERNAL_DOWNLOADSDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION] (MEDIUM): The skill processes untrusted data from external websites via fetch_page.py and extract_text.py. This creates an indirect prompt injection surface where malicious instructions embedded in scraped HTML or text could influence the agent's subsequent reasoning or actions.
- Ingestion points: URL content fetched by fetch_page.py, extract_links.py, and extract_text.py.
- Boundary markers: Absent. There are no instructions to the agent to treat scraped content as data rather than instructions.
- Capability inventory: Network access (requests), file writing (--output flag), and potential for the agent to use extracted text for decision making.
- Sanitization: Absent. The skill extracts raw HTML or text without filtering for potential injection patterns.
- [DATA_EXFILTRATION] (LOW): The skill performs network operations to arbitrary URLs. While this is the intended purpose of a web scraper, network access is a prerequisite for data exfiltration.
- [EXTERNAL_DOWNLOADS] (LOW): The skill requires the installation of external Python packages (requests, beautifulsoup4). These are standard and trusted libraries, though no versions are pinned in the requirements.
Audit Metadata