codex
Fail
Audited by Gen Agent Trust Hub on Feb 20, 2026
Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION] (HIGH): User-provided task descriptions are directly interpolated into the instructions for the autonomous subagent. An attacker could craft a task that overrides the subagent's constraints to perform malicious actions like deleting files, exfiltrating data, or establishing persistence.
- [COMMAND_EXECUTION] (HIGH): The skill executes the
codexCLI with--full-autoby default, granting it the ability to modify the workspace autonomously. This bypasses typical human-in-the-loop safety checks for AI-driven modifications. - [PRIVILEGE_ESCALATION] (HIGH): The skill explicitly includes a
danger-full-accesssandbox mode. While the skill claims this should be used with caution, its presence provides a direct path for the agent to acquire full system permissions. - [INDIRECT_PROMPT_INJECTION] (LOW): The skill gathers context from the environment (git status, diffs, logs). If the repository contains malicious content in file names, commit messages, or code, it could influence the subagent's behavior.
- Ingestion points:
git status,git diff,git logoutput. - Boundary markers: The prompt uses
<context>,<task>,<constraints>, and<output>tags, which help but do not eliminate the risk of adversarial input. - Capability inventory: The subagent has
Bash,Read,Grep, andGlobtools, and can modify files. - Sanitization: No explicit sanitization of the gathered git context or user input is performed before interpolation into the prompt.
Recommendations
- AI detected serious security threats
Audit Metadata