ralph-driven-development
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION] (HIGH): The skill documentation explicitly instructs users to run the AI agent with flags that bypass safety filters and approval steps (e.g.,
--dangerously-bypass-approvals-and-sandbox). This removes standard AI guardrails and makes the system vulnerable to adversarial instructions. - [COMMAND_EXECUTION] (HIGH): The
ralph.pyscript executes external binaries (defaulting tocodex) with user-provided arguments. While it usessubprocess.runwithoutshell=True, it encourages the use of high-privilege flags that allow the AI to perform unrestricted system operations. - [INDIRECT_PROMPT_INJECTION] (HIGH): The skill possesses a significant indirect injection surface by processing untrusted specification files from the
docs/tasks/directory. - Ingestion points:
scripts/ralph.pyreads filenames from a tasks directory and prompts the agent to implement their content. - Boundary markers: Absent. There are no delimiters or instructions to the agent to treat the spec content as untrusted data.
- Capability inventory: When running with the recommended bypass flags, the agent has full filesystem and execution access to the repository and host environment.
- Sanitization: None. The script does not validate the content of the specifications before directing the agent to follow them.
- [REMOTE_CODE_EXECUTION] (HIGH): By design, the skill executes code generated or suggested by an AI based on untrusted inputs. In an adversarial scenario, a spec file could contain instructions to execute malicious payloads, which the agent would perform due to the disabled security sandbox.
- [DATA_EXPOSURE] (MEDIUM): The script writes agent-generated "learnings" to
AGENTS.md. This creates a secondary injection surface where a malicious spec can poison the knowledge base for all future agent runs, leading to persistent compromise.
Recommendations
- AI detected serious security threats
Audit Metadata