reviewing-plans-with-codex

Fail

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill uses the Bash tool to execute a command that directly interpolates the {plan_file_fullpath} variable into a string. A malicious user could provide a file path containing shell control characters (e.g., ;, &&, or backticks) to execute arbitrary commands on the host system.
  • [REMOTE_CODE_EXECUTION]: The vulnerability in the bash command construction allows for remote code execution on the environment where the agent's tools are running.
  • [PROMPT_INJECTION]: The skill exhibits an indirect prompt injection surface. It reads content from a user-specified file and passes it to an external LLM-based CLI tool (codex). Malicious instructions embedded within the Markdown file could influence the output and behavior of the external reviewer model.
  • Ingestion points: INSTRUCTIONS.md (via the Read tool used on {plan_file_path})
  • Boundary markers: Absent (file content is passed to the tool without delimiters or instructions to ignore embedded commands)
  • Capability inventory: Bash tool (used to execute the codex CLI tool)
  • Sanitization: Absent (no validation or escaping of the file content is performed before processing)
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 6, 2026, 02:13 PM