codex-skill
Fail
Audited by Gen Agent Trust Hub on Mar 9, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSDATA_EXFILTRATIONREMOTE_CODE_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: The skill mandates autonomous execution, directing the agent to complete tasks without seeking approval and avoid confirmation for standard operations. It highlights a dangerous flag --dangerously-bypass-approvals-and-sandbox which explicitly skips all security confirmations and sandboxing. The agent is also highly susceptible to indirect prompt injection.
- Ingestion points: Reads and processes untrusted workspace files via Read, Grep, and Bash tools.
- Boundary markers: No delimiters or 'ignore embedded instructions' warnings are provided to the agent.
- Capability inventory: The agent has high-privilege capabilities including full Bash execution, package management (npm, brew), and system-wide file access.
- Sanitization: No mechanisms are specified to sanitize, escape, or validate the content of processed code.
- [COMMAND_EXECUTION]: The agent leverages the codex CLI and various permitted Bash tools (npm, brew, cat, etc.) to run arbitrary shell commands autonomously. The 'Danger-Full-Access Mode' explicitly allows system-level operations outside the workspace.
- [EXTERNAL_DOWNLOADS]: The skill facilitates the installation of the @openai/codex package and manages system dependencies using brew and npm. While the package originates from a well-known service (OpenAI), the autonomous nature of the installation increases the risk profile.
- [DATA_EXFILTRATION]: The 'Danger-Full-Access Mode' grants permissions for network communication and access to all files on the system, creating a significant risk of sensitive data exposure or exfiltration if the agent's logic is subverted.
- [REMOTE_CODE_EXECUTION]: The core functionality of the skill is to interpret instructions and execute code locally via the codex CLI, which effectively acts as a remote code execution engine with access to system resources.
Recommendations
- AI detected serious security threats
Audit Metadata