codex

Fail

Audited by Gen Agent Trust Hub on Mar 1, 2026

Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION]: The skill directs the agent to build shell commands by interpolating raw user prompts into a pipeline (e.g., echo "your prompt here" | codex exec ...). If the prompt contains shell metacharacters such as semicolons, ampersands, or backticks, the shell may interpret them, allowing arbitrary command injection.
  • [REMOTE_CODE_EXECUTION]: The skill specifically facilitates the use of the --sandbox danger-full-access flag. This capability grants the Codex tool unconstrained access to the host file system and network, significantly escalating the impact of any code execution or command injection.
  • [COMMAND_EXECUTION]: All recommended commands are suffixed with 2>/dev/null by default. This systematic suppression of the standard error stream prevents the user from seeing warnings, permission denials, or diagnostic output that would indicate a security breach or tool malfunction.
  • [PROMPT_INJECTION]: The instructions for handling disagreements ('When Codex is Wrong') encourage the agent to relay external information and reasoning into the execution environment via shell pipes. This creates a surface for indirect injection where malicious content from web searches or other external sources could be executed by the Codex tool.
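The first three findings share one root cause: the prompt passes through a shell, and diagnostics are discarded. As a minimal sketch of one possible mitigation (not the skill's or the auditor's own fix), the prompt can be written to the child process's stdin with an argument-list invocation, so no shell ever parses it, while stderr is kept. The helper name `run_with_prompt`, the `FORBIDDEN_SANDBOX` guard, and the echo stand-in process are all hypothetical; the real `codex` CLI is deliberately not invoked here.

```python
import subprocess
import sys

FORBIDDEN_SANDBOX = "danger-full-access"  # flag value named in the audit


def run_with_prompt(argv, prompt):
    """Run argv with `prompt` on stdin, without involving a shell."""
    if FORBIDDEN_SANDBOX in argv:
        raise ValueError("refusing to run with --sandbox danger-full-access")
    # A list argv plus input= means the prompt is never parsed by a shell.
    # stderr is captured rather than redirected to /dev/null.
    return subprocess.run(argv, input=prompt, text=True,
                          capture_output=True, check=False)


# Stand-in for the real CLI (assumption): a child Python process
# that echoes its stdin back on stdout.
ECHO_STDIN = [sys.executable, "-c",
              "import sys; sys.stdout.write(sys.stdin.read())"]

hostile = 'summarize"; echo INJECTED; `id` $(whoami) &'
result = run_with_prompt(ECHO_STDIN, hostile)
print(result.stdout == hostile)  # the metacharacters arrive as plain text
if result.stderr:
    print(result.stderr, file=sys.stderr)  # surface diagnostics to the user
```

With the real tool, `argv` would presumably start with `["codex", "exec", ...]`; whether the CLI reads the prompt from stdin in exactly this way is taken from the piping pattern quoted in the audit and should be checked against the tool's documentation. This sketch only addresses shell injection and stderr suppression; it does not address the indirect prompt injection described in the last finding.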
Recommendations
  • AI detected serious security threats
Audit Metadata
  • Risk Level: HIGH
  • Analyzed: Mar 1, 2026, 11:13 AM