ralphmode

Warn

Audited by Gen Agent Trust Hub on Mar 12, 2026

Risk Level: MEDIUM (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
  • [PROMPT_INJECTION]: The skill instructs users to bypass platform-level safety features, such as human-in-the-loop approval prompts, through settings like 'bypassPermissions' (Claude Code) and 'approval_policy = "never"' (Codex CLI).
    • Evidence: SKILL.md and references/permission-profiles.md give instructions for 'automation with fewer approval prompts' and for 'full bypass' modes in disposable sandboxes.
  • [COMMAND_EXECUTION]: The skill suggests configurations that allow autonomous execution of shell commands and tools without explicit user confirmation for each action.
    • Evidence: references/permission-profiles.md provides configuration snippets for the '--dangerously-skip-permissions' flag and the 'sandbox_mode = "danger-full-access"' setting.
  • [PROMPT_INJECTION]: The skill creates a surface for indirect prompt injection: on platforms that lack native hook-based blocking, it relies on 'prompt contracts' (textual instructions for the agent) to enforce safety boundaries.
    • Ingestion points: The agent context receives untrusted data from repository files via the Read and Bash tools, as defined in SKILL.md.
    • Boundary markers: references/permission-profiles.md suggests using the 'CHECKPOINT_NEEDED' string as an instruction-based delimiter for safety halts.
    • Capability inventory: The agent is granted the Read, Write, Bash, Grep, and Glob tools across all files in the workspace (SKILL.md).
    • Sanitization: No sanitization, escaping, or validation of external data is implemented beyond the instructional checkpoint rule.
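The settings cited as evidence above are literal strings, so a reviewer could check a skill's files for them mechanically. The following is a minimal sketch of such a check, not the audit tool's actual logic: the function name `find_risky_settings`, the regular expressions, and the sample text are all hypothetical, and only the four setting strings come from the findings.

```python
import re

# Settings named in the findings above. The regexes are illustrative
# assumptions, not the rules used by the audit tool.
RISKY_PATTERNS = {
    "bypassPermissions": re.compile(r"\bbypassPermissions\b"),
    "--dangerously-skip-permissions": re.compile(
        r"--dangerously-skip-permissions\b"
    ),
    'approval_policy = "never"': re.compile(
        r'approval_policy\s*=\s*"never"'
    ),
    'sandbox_mode = "danger-full-access"': re.compile(
        r'sandbox_mode\s*=\s*"danger-full-access"'
    ),
}


def find_risky_settings(text):
    """Return (line_number, setting) pairs for each flagged setting in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


# Hypothetical sample input, made up for this example.
sample = '''approval_policy = "never"
sandbox_mode = "workspace-write"
claude --dangerously-skip-permissions
'''

print(find_risky_settings(sample))
# → [(1, 'approval_policy = "never"'), (3, '--dangerously-skip-permissions')]
```

A string match like this only shows that a setting is mentioned. It cannot tell whether the surrounding text recommends the setting or warns against it, so each hit would still need human review.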
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 12, 2026, 08:02 AM