firecrawl-research-patterns

Warn

Audited by Gen Agent Trust Hub on Apr 20, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill instructs the agent to execute shell commands on a remote host via SSH to manage Docker containers and systemd services. It specifically uses sudo systemctl --user -M kab@ to manage services belonging to a different user, which represents a high level of system control and privilege management.
  • [REMOTE_CODE_EXECUTION]: The bootstrap guide provides instructions for generating a Bun-based scraper script (firecrawl-scraper.ts) and configuring it as a persistent systemd service. This involves writing executable code to the filesystem and ensuring its persistent execution across sessions.
  • [EXTERNAL_DOWNLOADS]: The skill guides the agent to download the Firecrawl repository from GitHub and the Caddy binary from its official releases. It also involves downloading LaTeX source archives from arXiv for document conversion and processing.
  • [PROMPT_INJECTION]: The skill processes scraped web content via an LLM to extract learnings and generate follow-up search queries. This workflow is vulnerable to indirect prompt injection, as malicious instructions embedded in scraped websites could influence the agent's behavior during extraction and recursion. Mandatory Evidence:
  • Ingestion points: Web content is ingested via Firecrawl and Jina Reader (described in SKILL.md Section 3 and recursive-research-protocol.md).
  • Boundary markers: No explicit boundary markers or instructions to ignore embedded commands are provided for the extraction step.
  • Capability inventory: The skill has access to Bash, Write, Edit, and network tools, providing a significant impact path if an injection succeeds.
  • Sanitization: Content is trimmed for token limits, but no sanitization or validation of the scraped text is performed before LLM processing.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 20, 2026, 09:36 AM