codex-cli
Fail
Audited by Gen Agent Trust Hub on Mar 14, 2026
Risk Level: HIGHCOMMAND_EXECUTIONDATA_EXFILTRATIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill is designed to run the
codex execcommand, and it explicitly promotes the use of the--full-autoflag. This allows an AI model to autonomously execute shell commands and modify local files without requiring user confirmation, which could be abused to perform destructive actions. - [DATA_EXFILTRATION]: The skill documents an unrestricted mode using the flag
--sandbox danger-full-access. This mode permits network access while the tool is operating on the local codebase, providing a direct mechanism for an AI agent to exfiltrate sensitive files, environment variables, or source code to external servers. - [REMOTE_CODE_EXECUTION]: By facilitating the execution of AI-generated shell commands on the host system, the skill introduces a risk of remote code execution. If the model generates malicious payloads—whether due to its own training or external influence—those payloads are executed with the permissions of the local user.
- [PROMPT_INJECTION]: The skill is highly susceptible to indirect prompt injection attacks. Ingestion points: The skill processes entire codebases and git diffs as input for analysis. Boundary markers: None; there are no instructions provided to the agent to distinguish between its system instructions and instructions embedded in the code being reviewed. Capability inventory: The tool can perform subprocess execution (
command_execution), file system modifications (file_change), and network-based operations (web_search,mcp_tool_call) as listed in the reference documentation. Sanitization: None; the skill provides a direct path for executing the model's output as system commands, meaning any malicious instruction successfully injected into the model's context will be executed directly.
Recommendations
- AI detected serious security threats
Audit Metadata