crawler

Pass

Audited by Gen Agent Trust Hub on Mar 14, 2026

Risk Level: SAFEPROMPT_INJECTIONDATA_EXFILTRATIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: Indirect Prompt Injection Surface. The skill fetches and converts content from arbitrary external URLs into markdown for agent processing. This content is untrusted and could contain malicious instructions intended to manipulate the agent's behavior (Category 8).
  • Ingestion points: scripts/crawl.py via modules firecrawl_scraper.py, jina_reader.py, and scrapling_scraper.py.
  • Boundary markers: Absent. The output markdown is not wrapped in protective delimiters or accompanied by instructions to ignore embedded commands.
  • Capability inventory: The skill can write files to the local system and perform network requests.
  • Sanitization: Content is converted from HTML to Markdown, which strips executable HTML/JS but leaves natural language instructions intact.
  • [DATA_EXFILTRATION]: File Write Capability. The scripts/crawl.py script and scripts/src/scrapling_scraper.py module provide an --output parameter that allows writing the scraped content to an arbitrary file path. While instructions recommend a specific directory, the script does not programmatically enforce path restrictions, allowing for potential file overwrites if the agent is misled.
  • [EXTERNAL_DOWNLOADS]: Remote Content Retrieval. The skill communicates with external services and libraries to retrieve web data:
  • Fetches content via Jina Reader at https://r.jina.ai/ (a well-known service).
  • Integrates with the Firecrawl API for structured scraping.
  • Uses the scrapling library for local headless browser scraping, which may download browser binaries as part of its standard operation.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 14, 2026, 02:41 AM