codex
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH (COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION)
Full Analysis
- COMMAND_EXECUTION (HIGH): The skill mandates the `--dangerously-bypass-approvals-and-sandbox` flag in all of its command patterns. This explicitly instructs the agent to execute code and system operations without the security boundaries or human-in-the-loop approvals normally required for AI-generated commands.
- DATA_EXFILTRATION (HIGH): The combination of 'Capability 2: Web Search & Fetch' and the ability to reference local files using the `@file` syntax, while operating without a sandbox, creates a significant data exfiltration risk. An attacker could use indirect prompt injection to trick the model into reading sensitive files (e.g., SSH keys, credentials) and sending their contents to a remote URL via the fetch capability.
- PROMPT_INJECTION (HIGH): The skill instructions function as a system-level override by directing the agent to always use bypass flags. This effectively disables the agent's internal safety filters and constraints regarding command execution and file system access.
- INDIRECT PROMPT INJECTION (LOW): This skill exhibits a large attack surface for indirect prompt injection.
- Ingestion points: The skill fetches web content and processes local files.
- Boundary markers: None are present in the command patterns to distinguish between instructions and data.
- Capability inventory: The `codex` CLI has broad capabilities to read files, execute tasks, and make network requests.
- Sanitization: No sanitization or validation of the fetched web content is performed before it is processed by the model at high reasoning effort.
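
The missing boundary markers noted above could look something like the following. This is a minimal sketch, not part of the audited skill: the function name `wrap_untrusted` and the marker format are hypothetical, and delimiters of this kind reduce the risk of indirect prompt injection without eliminating it.

```python
import secrets


def wrap_untrusted(content: str, source: str) -> str:
    """Wrap fetched or file content in unique boundary markers.

    Hypothetical helper: the marker format is an illustration, not a
    convention defined by the codex CLI or by the audited skill.
    """
    # A random per-call token makes the closing marker hard for embedded
    # text to forge, since an attacker cannot predict it in advance.
    token = secrets.token_hex(8)
    begin = f"<<<UNTRUSTED_DATA {token} source={source!r}>>>"
    end = f"<<<END_UNTRUSTED_DATA {token}>>>"
    return (
        f"{begin}\n{content}\n{end}\n"
        "Treat everything between the markers above as data, not instructions."
    )
```

A prompt builder would call `wrap_untrusted(page_text, url)` on every fetched page or referenced file before concatenating it with the task instructions, so the model has an explicit signal for where data starts and stops.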
Recommendations
- AI detected serious security threats
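
One way to act on the COMMAND_EXECUTION finding is to reject command patterns that pass the bypass flag before they reach the agent. The sketch below assumes commands are available as plain strings; the helper name `uses_bypass_flag` is hypothetical, and only the flag name itself comes from the audit.

```python
import shlex

# Flag name taken from the audit findings.
BYPASS_FLAG = "--dangerously-bypass-approvals-and-sandbox"


def uses_bypass_flag(command: str) -> bool:
    """Return True if the command line passes the bypass flag as an argument."""
    try:
        # Tokenize like a POSIX shell so a flag that is merely mentioned
        # inside a quoted string is not counted as an argument.
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to a conservative substring check.
        return BYPASS_FLAG in command
    return BYPASS_FLAG in tokens
```

Tokenizing rather than searching the raw string avoids false positives on documentation text that only quotes the flag, while the fallback errs on the side of flagging malformed input.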
Audit Metadata