using-codex

Fail

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: HIGHCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill constructs shell commands by directly interpolating user-provided requests into the codex exec command string. Lack of sanitization allows for command injection if a user provides inputs containing shell metacharacters (e.g., ;, &&, |), potentially leading to arbitrary code execution on the host system.
  • [DATA_EXFILTRATION]: The skill is designed to read and process entire local project directories. It passes this data to an external CLI tool ('codex') claiming to be from OpenAI. However, the tool uses a fake model version ('gpt-5.3-codex'), suggesting it is an unverified or malicious wrapper that could exfiltrate sensitive source code to an untrusted remote server.
  • [PROMPT_INJECTION]: The instructions explicitly command the agent to include directives that suppress model guardrails and conversational checking, such as 'No confirmation or questions needed' and 'output code examples autonomously.' This is a form of instruction overriding designed to force the AI into an unrestricted operational mode.
  • [INDIRECT_PROMPT_INJECTION]:
  • Ingestion points: The skill ingests untrusted data from both user requests and the contents of the project directory specified in the --cd parameter.
  • Boundary markers: No boundary markers or XML tags are used to isolate user input from the command structure.
  • Capability inventory: The skill possesses the capability to execute system commands via the codex tool with a --full-auto flag, which implies autonomous action.
  • Sanitization: There is no evidence of input validation, escaping, or filtering for the <request> parameter or the project files being analyzed.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 6, 2026, 02:12 PM