doc-scraper
Warn
Audited by Snyk on Mar 13, 2026
Risk Level: MEDIUM
Full Analysis
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 1.00). The CLI tool and scripts (scripts/doc_scraper.py: discover_urls, fetch_and_convert, scrape_urls), together with scripts/scraper_config.yaml (sitemap_url: "https://docs.snowflake.com/en/sitemap.xml"), explicitly fetch and parse public docs.snowflake.com pages. The page content and extracted links then drive spidering and subsequent actions, so third-party web content can materially influence the agent's behavior.
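One way to narrow this exposure is to restrict spidering to an explicit host allowlist, so links extracted from a fetched page cannot steer the crawler to other origins. The following is a minimal sketch, not the tool's actual code: filter_links and ALLOWED_HOSTS are hypothetical names, and the single allowed host is assumed from the sitemap_url in the finding above.

```python
from urllib.parse import urljoin, urlparse

# Assumption: the only host the scraper should visit, taken from sitemap_url.
ALLOWED_HOSTS = {"docs.snowflake.com"}


def filter_links(base_url, hrefs, allowed_hosts=ALLOWED_HOSTS):
    """Resolve hrefs against base_url; keep only https links on allowed hosts."""
    kept = []
    for href in hrefs:
        absolute = urljoin(base_url, href)
        parts = urlparse(absolute)
        # Exact hostname match rejects look-alikes such as
        # docs.snowflake.com.evil.example and non-web schemes.
        if parts.scheme == "https" and parts.hostname in allowed_hosts:
            kept.append(absolute)
    return kept


links = filter_links(
    "https://docs.snowflake.com/en/user-guide",
    [
        "/en/sql-reference",
        "https://evil.example/payload",
        "http://docs.snowflake.com/en/insecure",
        "javascript:alert(1)",
    ],
)
print(links)
```

An allowlist limits where the crawler goes but does not address instructions embedded in the page text itself; fetched content would still need to be handled as untrusted data by whatever consumes it.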
MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).
- Potentially malicious external URL detected (high risk: 0.90). The script's runtime bootstrap auto-installer downloads and executes external install scripts (curl -LsSf https://astral.sh/uv/install.sh | sh and irm https://astral.sh/uv/install.ps1 | iex) to satisfy a required dependency (uv). Because this runs remote code at runtime, these URLs present a high-risk remote-code-execution vector.
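A common mitigation for piping a remote script straight into a shell is to download it first and compare its digest against a pinned value before executing anything. The sketch below shows only the comparison step, under stated assumptions: verify_sha256 is a hypothetical helper, and the expected digest would have to be obtained and pinned out of band (no real digest for the uv installer is given here).

```python
import hashlib
import hmac


def verify_sha256(payload: bytes, expected_hex: str) -> bool:
    """Return True only if payload hashes to the pinned SHA-256 digest."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


# Illustration with a stand-in payload; a real pin would come from a
# trusted release record, not be computed from the download itself.
installer = b"echo installing\n"
pinned = hashlib.sha256(installer).hexdigest()

print(verify_sha256(installer, pinned))              # untampered payload
print(verify_sha256(installer + b"rm -rf ~", pinned))  # tampered payload
```

A bootstrap step following this pattern would refuse to run the installer when the check fails, instead of executing whatever the URL returned.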
Issues (2)
W011 (MEDIUM): Third-party content exposure detected (indirect prompt injection risk).
W012 (MEDIUM): Unverifiable external dependency detected (runtime URL that controls agent).
Audit Metadata