NYC

codex

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGH (COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION)
Full Analysis
  • COMMAND_EXECUTION (HIGH): The skill consistently mandates the use of the --dangerously-bypass-approvals-and-sandbox flag in all its command patterns. This explicitly instructs the agent to execute code and system operations without the security boundaries or human-in-the-loop approvals normally required for AI-generated commands.
  • DATA_EXFILTRATION (HIGH): The combination of 'Capability 2: Web Search & Fetch' and the ability to reference local files using the @file syntax, while operating without a sandbox, creates a significant data exfiltration risk. An attacker could use indirect prompt injection to trick the model into reading sensitive files (e.g., SSH keys, credentials) and sending their contents to a remote URL via the fetch capability.
  • PROMPT_INJECTION (HIGH): The skill instructions function as a system-level override by directing the agent to always use bypass flags. This effectively disables the agent's internal safety filters and constraints regarding command execution and file system access.
  • INDIRECT PROMPT INJECTION (LOW): This skill exhibits a large attack surface for indirect prompt injection:
    • Ingestion points: The skill fetches web content and processes local files.
    • Boundary markers: None are present in the command patterns to distinguish instructions from data.
    • Capability inventory: The codex CLI has broad capabilities to read files, execute tasks, and make network requests.
    • Sanitization: Fetched web content is not sanitized or validated before the model processes it at high reasoning effort.
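Two of the findings above can be illustrated with a minimal Python sketch: checking a command pattern for the bypass flag, and wrapping untrusted content in boundary markers before it reaches the model. Only the flag name comes from the audit. The function names, the marker format, and the example command strings in the usage note are assumptions made for illustration, and boundary markers on their own do not prevent prompt injection; they only make the instruction/data boundary explicit.

```python
import shlex

# Flag named in the audit; everything else in this sketch is hypothetical.
BYPASS_FLAG = "--dangerously-bypass-approvals-and-sandbox"


def uses_bypass_flag(command: str) -> bool:
    """Return True if the command line passes the bypass flag as its own argument."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to a plain whitespace split.
        tokens = command.split()
    return BYPASS_FLAG in tokens


def wrap_untrusted(content: str, source: str) -> str:
    """Wrap fetched or file content in explicit markers so it is presented as data.

    The marker format is an assumption, not something the audited skill defines.
    """
    # Break up marker-like sequences so the content cannot close the block early.
    cleaned = content.replace("<<<", "< < <").replace(">>>", "> > >")
    return (
        f"<<<UNTRUSTED_DATA source={source!r}>>>\n"
        f"{cleaned}\n"
        "<<<END_UNTRUSTED_DATA>>>"
    )
```

As a usage sketch, `uses_bypass_flag('codex exec --dangerously-bypass-approvals-and-sandbox "task"')` flags the command, while a prompt that merely mentions the flag inside a quoted argument does not match, because `shlex.split` keeps the quoted text as a single token.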
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 17, 2026, 06:07 PM