codex
Fail
Audited by Gen Agent Trust Hub on May 13, 2026
Risk Level: HIGH
Categories: COMMAND_EXECUTION, PROMPT_INJECTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS
Full Analysis
- [PROMPT_INJECTION]: The skill utilizes deceptive instructions that claim the agent is using a non-existent "GPT-5.4" model with "extra-high reasoning" capabilities. This deception is designed to override the agent's standard safety protocols and mislead it into believing it is operating in a more advanced, potentially less restricted mode.
- [COMMAND_EXECUTION]: The documentation provides comprehensive guidance on using the `codex` CLI with flags that explicitly bypass security controls. It documents the `--dangerously-bypass-approvals-and-sandbox` flag and the `danger-full-access` sandbox mode, both of which are intended to skip all human confirmation prompts and execute commands without sandboxing.
- [COMMAND_EXECUTION]: The skill's instructions promote the use of permissive configuration settings such as `workspace-write` for unrestricted file modifications and `-c approval_policy=never`, which removes the requirement for user approval before the execution of model-generated commands.
- [REMOTE_CODE_EXECUTION]: The skill describes functionality for the `codex cloud` subcommand, including `exec` and `apply`, which are used to retrieve code changes from a remote service and apply them directly to the local working directory.
- [EXTERNAL_DOWNLOADS]: The `skill_mcp_dependency_install` feature mentioned in the documentation indicates that the tool may automatically download and install external dependencies during execution, which could facilitate the introduction of unverified code into the environment.
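
The findings above name specific strings, so a reviewer could search a skill's documentation for them before installing it. The following is a minimal sketch of that idea, not the method the audit itself used. The function name `scan_skill_text` and the `RISKY_PATTERNS` table are hypothetical, and the substring matching is deliberately naive. Only the strings and category tags come from the findings.

```python
# Hypothetical lookup table: each string is quoted from the findings above,
# mapped to the audit category it was reported under.
RISKY_PATTERNS = {
    "--dangerously-bypass-approvals-and-sandbox": "COMMAND_EXECUTION",
    "danger-full-access": "COMMAND_EXECUTION",
    "workspace-write": "COMMAND_EXECUTION",
    "approval_policy=never": "COMMAND_EXECUTION",
    "codex cloud": "REMOTE_CODE_EXECUTION",
    "skill_mcp_dependency_install": "EXTERNAL_DOWNLOADS",
}


def scan_skill_text(text):
    """Return (line_number, category, pattern) for every risky string found.

    Plain substring matching is an assumption made for brevity; it will miss
    reworded or obfuscated instructions and may flag harmless mentions.
    """
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, category in RISKY_PATTERNS.items():
            if pattern in line:
                findings.append((lineno, category, pattern))
    return findings
```

A match only indicates that the documentation mentions a setting; whether the skill actually instructs the agent to use it still needs a human read.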
Recommendations
- AI detected serious security threats
Audit Metadata