ralph-wiggum
Fail
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
- [PROMPT_INJECTION] (HIGH): The skill implements an autonomous loop that reads and writes to state files in .ralph/.
  - Ingestion points: The agent reads instructions and state from .ralph/guardrails.md, .ralph/state.md, and RALPH_TASK.md within the workspace.
  - Boundary markers: No explicit delimiters or XML tagging are used to separate user-provided or externally sourced content from agent instructions.
  - Capability inventory: The skill is designed for autonomous software development, which requires writing files, running tests (subprocess calls), and executing bash/git commands.
  - Sanitization: None. The 'signs' or guardrails are accumulated based on 'observed failures' without validation, meaning a malicious error message or code comment encountered during development could be promoted to a persistent system instruction.
- [COMMAND_EXECUTION] (MEDIUM): The skill's fundamental purpose is to autonomously execute arbitrary commands for software development.
  - Evidence: Documentation specifies usage of bash, jq, and git to 'implement, test, refine' projects via Cursor hooks.
  - Risk: While intended for productivity, the lack of human-in-the-loop verification for autonomous iterations creates a significant attack surface if the agent is directed to process untrusted codebases or external requirements.
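To illustrate the boundary-marker and sanitization gaps noted above, here is a minimal sketch of what such controls could look like. It is not taken from the audited skill: the marker format, the `wrap_untrusted` and `accept_guardrail` helpers, and the pattern list are all hypothetical, and a pattern filter of this kind is a heuristic that reduces rather than eliminates injection risk.

```python
import re
import secrets
from typing import Optional

# Hypothetical heuristics for text that reads like an instruction to the agent
# or like a shell payload. A real deployment would need a far richer policy.
SUSPICIOUS = re.compile(
    r"(ignore (all |any )?(previous|prior) instructions"
    r"|system prompt|curl\s|wget\s|rm\s+-rf)",
    re.IGNORECASE,
)


def wrap_untrusted(text: str, nonce: Optional[str] = None) -> str:
    """Delimit externally sourced text with markers carrying a random nonce.

    Because the nonce is unpredictable, content inside the block cannot
    forge the closing marker to escape into the instruction context.
    """
    nonce = nonce or secrets.token_hex(8)
    return f"BEGIN_UNTRUSTED_{nonce}\n{text}\nEND_UNTRUSTED_{nonce}"


def accept_guardrail(candidate: str, max_len: int = 200) -> bool:
    """Return True only if a candidate guardrail passes basic validation.

    Rejects multi-line or overlong entries and anything matching the
    suspicious-pattern heuristics, so raw error output is not promoted
    verbatim into a persistent instruction file.
    """
    if len(candidate) > max_len or "\n" in candidate:
        return False
    return SUSPICIOUS.search(candidate) is None
```

Under these assumptions, an observed failure message would be passed through `accept_guardrail` before being appended to a guardrails file, and any file content or tool output shown to the agent would first go through `wrap_untrusted`.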
Recommendations
- AI detected serious security threats
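One way to address the missing human-in-the-loop verification described under COMMAND_EXECUTION is a gate that lets only allowlisted commands run unattended and routes everything else to a reviewer. The sketch below is an assumption, not part of the skill: `ALLOWED_PROGRAMS`, `is_allowed`, and `gate` are hypothetical names, and allowlisting by program name is deliberately coarse, since a permitted tool such as git can itself be configured to run other programs.

```python
import shlex
from typing import Callable

# Hypothetical allowlist of programs that may run without review.
ALLOWED_PROGRAMS = {"git", "jq"}

# Characters that enable chaining, substitution, or redirection in a shell.
SHELL_METACHARACTERS = ";|&`$><"


def is_allowed(command: str) -> bool:
    """Return True if the command is a single allowlisted program invocation."""
    if any(ch in command for ch in SHELL_METACHARACTERS):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes or similar: treat as not allowed.
        return False
    return bool(tokens) and tokens[0] in ALLOWED_PROGRAMS


def gate(command: str, confirm: Callable[[str], bool]) -> bool:
    """Decide whether a command may run.

    Allowlisted commands pass automatically; anything else requires the
    `confirm` callback (a human reviewer in practice) to return True.
    """
    if is_allowed(command):
        return True
    return confirm(command)
```

In an interactive session, `confirm` could be a function that prints the command and returns True only when the reviewer types `y`; in an unattended run it could always return False, so that non-allowlisted commands are refused.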
Audit Metadata