web-scraper

Warn

Audited by Gen Agent Trust Hub on Mar 7, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION]: The skill directs the agent to use bash and curl to fetch content from URLs provided by the user or discovered through search. There is a risk of shell command injection if the agent concatenates these untrusted URLs directly into shell strings without proper escaping or sanitization.
  • [REMOTE_CODE_EXECUTION]: Strategy C in the skill instructions involves piping remote data directly from a curl request into a python3 process for parsing. While the Python code itself is defined within the skill, this pattern is dangerous when processing untrusted network data (such as XML), as it may expose the system to exploits like XML External Entity (XXE) attacks or other parser vulnerabilities. Additionally, the skill's reliance on generating and executing JavaScript via the javascript_tool and Python scripts via bash at runtime constitutes a dynamic execution risk.
  • [EXTERNAL_DOWNLOADS]: The skill is designed to download files (CSV, XML, JSON) from arbitrary external URLs into the /tmp directory. This capability could be used to retrieve potentially malicious payloads or facilitate data exfiltration if the agent's logic is manipulated by an attacker.
  • [PROMPT_INJECTION]: As a data extraction tool, the skill is highly susceptible to indirect prompt injection from the websites it scrapes.
      • Ingestion points: Untrusted data enters the agent's context through WebFetch, browser page reads, and file downloads.
      • Boundary markers: The skill uses structured prompts for extraction (e.g., 'Extract [DATA_TARGET]'), but it lacks explicit instructions to ignore instructions or 'jailbreak' attempts embedded within the scraped content.
      • Capability inventory: The agent has extensive capabilities including shell access, browser automation, and file system operations, which increases the impact of a successful injection.
      • Sanitization: While the skill includes data cleaning for whitespace and Unicode normalization, it does not implement sanitization or filtering to detect and prevent malicious instructions inside the scraped data.
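The COMMAND_EXECUTION and EXTERNAL_DOWNLOADS findings above both turn on how an untrusted URL reaches curl. The following is a minimal sketch of one possible mitigation, not the skill's actual implementation: it validates the URL and builds an argument list so that no shell ever parses the URL. The names `build_curl_argv` and `ALLOWED_SCHEMES`, the 30-second timeout, and the restriction to http/https are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Assumption: the skill only ever needs to fetch ordinary web URLs.
ALLOWED_SCHEMES = {"http", "https"}


def build_curl_argv(url: str, out_path: str) -> list[str]:
    """Return a curl argument list for an untrusted URL (hypothetical helper).

    The URL is passed as a single list element, so shell metacharacters
    such as ';' or '$(...)' are never interpreted by a shell.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES or not parts.netloc:
        raise ValueError(f"refusing to fetch non-web URL: {url!r}")
    if any(ch.isspace() or ord(ch) < 0x20 for ch in url):
        raise ValueError("URL contains whitespace or control characters")
    return [
        "curl",
        "--fail", "--silent", "--show-error",
        "--proto", "=http,https",   # restrict curl itself to web protocols
        "--max-time", "30",         # assumed timeout, in seconds
        "--output", out_path,
        "--",                       # defense in depth: end of options
        url,
    ]
```

The resulting list would be handed to something like `subprocess.run(argv, check=True)` without `shell=True`, so the URL stays a single argument however it is spelled. This addresses only shell injection and scheme abuse; it does nothing about the content of what is downloaded.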
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 7, 2026, 03:42 PM