ralph-dogfooding

Fail

Audited by Gen Agent Trust Hub on Feb 25, 2026

Risk Level: HIGHCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/ralph-dogfooding.sh invokes the agent with the --dangerously-bypass-approvals-and-sandbox flag. This removes the agent security sandbox and disables all human-in-the-loop approvals, granting the agent full access to the host machine's filesystem, network, and shell.
  • [EXTERNAL_DOWNLOADS]: The skill setup (SKILL.md and references/linear-cli.md) requires the installation of an external skill from an untrusted GitHub repository: npx skills add https://github.com/schpet/linear-cli.
  • [PROMPT_INJECTION]: The skill is designed to browse untrusted external web applications and ingest their DOM structure via Playwright snapshots (browser_snapshot), creating a surface for indirect prompt injection. Combined with the sandbox bypass, this allows an attacker-controlled website to potentially execute arbitrary commands on the user's host.
  • [INDIRECT_PROMPT_INJECTION_EVIDENCE]: 1. Ingestion points: Data enters the agent context through Playwright snapshots and screenshots of the target web application. 2. Boundary markers: There are no markers or instructions used to delimit untrusted website content from agent instructions. 3. Capability inventory: The agent runs with full unsandboxed shell access and can create/modify local files (e.g., .ralph/ directory). 4. Sanitization: Content retrieved from the web via MCP is not sanitized or validated before being processed by the agent.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 25, 2026, 02:24 PM