ralph-driven-development

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION] (HIGH): The skill documentation explicitly instructs users to run the AI agent with flags that bypass safety filters and approval steps (e.g., --dangerously-bypass-approvals-and-sandbox). This removes standard AI guardrails and makes the system vulnerable to adversarial instructions.
  • [COMMAND_EXECUTION] (HIGH): The ralph.py script executes external binaries (defaulting to codex) with user-provided arguments. While it uses subprocess.run without shell=True, it encourages the use of high-privilege flags that allow the AI to perform unrestricted system operations.
  • [INDIRECT_PROMPT_INJECTION] (HIGH): The skill possesses a significant indirect injection surface by processing untrusted specification files from the docs/tasks/ directory.
  • Ingestion points: scripts/ralph.py reads filenames from a tasks directory and prompts the agent to implement their content.
  • Boundary markers: Absent. There are no delimiters or instructions to the agent to treat the spec content as untrusted data.
  • Capability inventory: When running with the recommended bypass flags, the agent has full filesystem and execution access to the repository and host environment.
  • Sanitization: None. The script does not validate the content of the specifications before directing the agent to follow them.
  • [REMOTE_CODE_EXECUTION] (HIGH): By design, the skill executes code generated or suggested by an AI based on untrusted inputs. In an adversarial scenario, a spec file could contain instructions to execute malicious payloads, which the agent would perform due to the disabled security sandbox.
  • [DATA_EXPOSURE] (MEDIUM): The script writes agent-generated "learnings" to AGENTS.md. This creates a secondary injection surface where a malicious spec can poison the knowledge base for all future agent runs, leading to persistent compromise.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 03:05 AM