ralph-mode
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: CRITICALCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
- [Command Execution] (CRITICAL): The skill implements a bash-based autonomous loop (ralph.sh) that executes arbitrary commands. Evidence: Documentation in SKILL.md explicitly recommends using the --dangerously-skip-permissions flag for autonomous mode, which removes all safety guardrails and allows the agent to execute any command on the host system without human intervention.
- [Indirect Prompt Injection] (HIGH): The skill is highly vulnerable to malicious instructions embedded in the data it processes. Ingestion points: prd.json (user stories). Boundary markers: Absent; user stories are interpolated directly into worker prompts. Capability inventory: Full shell access, git operations, and filesystem writes. Sanitization: None.
- [Remote Code Execution] (HIGH): The system is designed to allow an LLM to write and execute code autonomously. Evidence: Local mode uses bash to pipe prompts directly into the agent, which then modifies the local environment and executes commands in a loop until tasks are marked as passing.
- [Data Exposure & Exfiltration] (MEDIUM): The autonomous nature and lack of permission constraints allow for easy data theft. Evidence: An injected task could read sensitive files (such as .env, SSH keys, or AWS credentials) and use network tools like curl to exfiltrate them.
Recommendations
- AI detected serious security threats
Audit Metadata