ralph

Warn

Audited by Gen Agent Trust Hub on Mar 9, 2026

Risk Level: MEDIUM
Findings: EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION, DATA_EXFILTRATION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: Installation instructions in SKILL.md point to third-party GitHub repositories (github.com/Q00/ouroboros and github.com/supercent-io/skills-template) that are not part of the verified trusted vendors list.
  • [REMOTE_CODE_EXECUTION]: The skill suggests using 'npx' and 'gemini extensions install' to download and run code from unverified third-party sources.
  • [COMMAND_EXECUTION]: The setup script (scripts/setup-codex-hook.sh) modifies the user's ~/.codex/config.toml and creates new files in ~/.codex/prompts/ to persist skill behavior and developer instructions across sessions.
  • [PROMPT_INJECTION]: The 'Ralph' mode instructions (e.g., 'the boulder never stops', 'don't stop') promote persistent autonomous behavior that may attempt to override standard agent termination safety protocols.
  • [DATA_EXFILTRATION]: The skill architecture combines WebFetch and Bash capabilities within an autonomous loop, creating a potential path for data to be retrieved from the environment and sent externally if the agent's instructions are subverted.
  • [PROMPT_INJECTION]: The skill exposes a Category 8 (Indirect Prompt Injection) surface: it ingests external data via WebFetch (SKILL.md) and possesses Bash and Write capabilities (SKILL.md) with no documented boundary markers or content sanitization.
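
To illustrate what the "boundary markers" named in the last finding could look like, here is a minimal Python sketch. It is not taken from the skill or from the audit: the function name `wrap_untrusted` and the `<<UNTRUSTED ...>>` marker format are hypothetical, chosen only for illustration. Markers like these do not block injection on their own; they only help if the agent's instructions treat the delimited text as data rather than commands.

```python
import secrets


def wrap_untrusted(content: str, source: str) -> str:
    """Wrap fetched text in randomized boundary markers (hypothetical format)."""
    # A fresh random token per call makes the closing marker hard to forge
    # from inside the fetched content.
    token = secrets.token_hex(8)
    # Defensive: strip the token if it somehow appears in the content.
    body = content.replace(token, "")
    return (
        f"<<UNTRUSTED {token} source={source!r}>>\n"
        f"{body}\n"
        f"<<END UNTRUSTED {token}>>"
    )
```

A caller would pass the raw WebFetch result through `wrap_untrusted` before placing it in the agent's context, so that instructions embedded in the page stay visibly separated from the skill's own prompts.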
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 9, 2026, 11:58 AM