ralphmode
Warn
Audited by Gen Agent Trust Hub on Mar 12, 2026
Risk Level: MEDIUM (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
- [PROMPT_INJECTION]: The skill instructs users to bypass platform-level safety features, such as human-in-the-loop approval prompts, through configurations like 'bypassPermissions' (Claude Code) and 'approval_policy = "never"' (Codex CLI).
  - Evidence: SKILL.md and references/permission-profiles.md give instructions for 'automation with fewer approval prompts' and for 'full bypass' modes in disposable sandboxes.
- [COMMAND_EXECUTION]: The skill suggests configurations that let the agent run shell commands and tools autonomously, without explicit user confirmation for each action.
  - Evidence: references/permission-profiles.md provides configuration snippets for the '--dangerously-skip-permissions' flag and the 'sandbox_mode = "danger-full-access"' setting.
- [PROMPT_INJECTION]: The skill creates a surface for indirect prompt injection by relying on 'prompt contracts' (textual instructions for the agent) to enforce safety boundaries on platforms that lack native hook-based blocking.
  - Ingestion points: The agent context receives untrusted data from repository files via the Read and Bash tools, as defined in SKILL.md.
  - Boundary markers: references/permission-profiles.md suggests using the 'CHECKPOINT_NEEDED' string as an instruction-based delimiter for safety halts.
  - Capability inventory: The agent is granted the Read, Write, Bash, Grep, and Glob tools across all files in the workspace (SKILL.md).
  - Sanitization: No sanitization, escaping, or validation of external data content is implemented beyond the instructional checkpoint rule.
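The settings named in the findings above can be checked for mechanically. The following is a minimal sketch, not part of the audited skill or of the audit tooling: the function name `find_risky_settings`, the `RISKY_PATTERNS` table, and the regular expressions are illustrative assumptions. Only the four setting strings themselves come from the report.

```python
import re

# The four settings quoted in the audit findings. The regexes that match
# them are an illustrative assumption, not the auditor's actual rules.
RISKY_PATTERNS = {
    "bypassPermissions": re.compile(r"\bbypassPermissions\b"),
    "approval_policy=never": re.compile(
        r"approval_policy\s*=\s*[\"']never[\"']"
    ),
    "--dangerously-skip-permissions": re.compile(
        r"--dangerously-skip-permissions\b"
    ),
    "sandbox_mode=danger-full-access": re.compile(
        r"sandbox_mode\s*=\s*[\"']danger-full-access[\"']"
    ),
}


def find_risky_settings(text: str) -> list[tuple[int, str]]:
    """Return (line_number, setting_name) for each flagged setting in text.

    Line numbers are 1-based. A line that matches several patterns yields
    one entry per pattern, in the order of RISKY_PATTERNS.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

Passing the text of a configuration file or shell script to `find_risky_settings` returns the lines where any of these settings appear. An empty result only means none of the four strings was found; it does not show that a configuration is safe.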
Audit Metadata