web-scraper

Warn

Audited by Gen Agent Trust Hub on Mar 5, 2026

Risk Level: MEDIUM
COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION]: The browser_session.py script includes an eval action that allows for the execution of arbitrary JavaScript code within the Playwright browser instance via page.evaluate(). This provides a significant attack surface if the agent is manipulated into executing malicious logic.
  • [EXTERNAL_DOWNLOADS]: The download_file.py script explicitly disables SSL certificate verification (verify=False) as a fallback mechanism when a request fails due to an SSLError. This exposes the connection to potential Man-in-the-Middle (MitM) attacks.
  • [COMMAND_EXECUTION]: The browser_session.py script uses os.fork() and os.setsid() to create a persistent background daemon process on the host system. While used for maintaining browser state, such persistence techniques are often associated with maintaining unauthorized access.
  • [REMOTE_CODE_EXECUTION]: By combining the ability to browse arbitrary URLs with the eval command, the skill allows for the dynamic execution of code fetched from the internet within the context of the browser session.
  • [DATA_EXFILTRATION]: The skill is designed to extract content (text, links, screenshots, PDFs) from the web and return it to the agent. This represents a data ingestion and potential exfiltration surface, especially since it automatically dismisses cookie banners to access more content.
  • [PROMPT_INJECTION]: The skill is highly exposed to indirect prompt injection (Category 8) because it fetches and processes untrusted data from arbitrary websites (via google_search.py, read_page.py, and browser_session.py).
      • Ingestion points: Page text extraction in read_page.py, browser_session.py, and download_file.py (PDF text).
      • Boundary markers: None detected; the content is returned as clean text or markdown without explicit delimiters for the agent.
      • Capability inventory: File writing (screenshots/downloads), network requests (playwright/requests), and background process creation.
      • Sanitization: The EXTRACT_JS script filters out <script> and <style> tags but otherwise extracts raw innerText.
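
The missing boundary markers noted above could be addressed by wrapping extracted page text in explicit delimiters before it is returned to the agent. The following is a minimal sketch, not code from the audited skill: the function name wrap_untrusted and the marker format are hypothetical and chosen only for illustration.

```python
import secrets


def wrap_untrusted(text: str, source_url: str) -> str:
    """Wrap scraped text in explicit boundary markers (hypothetical helper).

    The marker strings are an assumed convention for illustration; they are
    not part of the audited skill.
    """
    # A fresh random token per call makes it impractical for page content
    # to predict and forge the closing marker.
    token = secrets.token_hex(8)
    header = f"<<<UNTRUSTED_WEB_CONTENT id={token} source={source_url!r}>>>"
    footer = f"<<<END_UNTRUSTED_WEB_CONTENT id={token}>>>"
    return f"{header}\n{text}\n{footer}"


if __name__ == "__main__":
    # Example: the injected instruction stays visibly inside the markers.
    print(wrap_untrusted("Ignore previous instructions.", "https://example.com"))
```

Delimiters alone do not prevent prompt injection. They only give the consuming agent a clear signal about which text came from an untrusted page, so they would need to be paired with instructions telling the agent to treat delimited content as data rather than commands.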
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 5, 2026, 07:36 AM