codex-cto

Fail

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [COMMAND_EXECUTION]: The shell script scripts/cto-invoke.sh is vulnerable to shell command injection. It uses the eval command to execute a string ($CODEX_CMD) that is constructed by interpolating the user-provided objective ($PROMPT) with only basic double-quoting. This allows an attacker or a malicious objective to escape the intended command context using standard shell meta-characters (e.g., ;, $(...), or `) and execute arbitrary code on the host system.
  • [REMOTE_CODE_EXECUTION]: The skill architecture establishes a 'plan-execute-review' loop where the agent is explicitly instructed to perform high-privilege actions—including run_command, create_file, and modify_file—based on the contents of a JSON file returned by an external tool (codex exec). This creates a significant indirect prompt injection surface: if the external model's output is compromised or manipulated, the agent will execute the resulting malicious commands or file changes without manual human review or safety validation.
  • [EXTERNAL_DOWNLOADS]: The skill requires the installation of an external NPM package (@openai/codex) and references scripts from a separate skill directory (codex-orchestrator), introducing dependencies on external code that is not contained within the skill's own repository.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 6, 2026, 11:25 AM