codex-cto
Fail
Audited by Gen Agent Trust Hub on Mar 6, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS)
Full Analysis
- [COMMAND_EXECUTION]: The shell script `scripts/cto-invoke.sh` is vulnerable to shell command injection. It uses the `eval` command to execute a string (`$CODEX_CMD`) that is constructed by interpolating the user-provided objective (`$PROMPT`) with only basic double-quoting. This allows an attacker or a malicious objective to escape the intended command context using standard shell meta-characters (e.g., `;`, `$(...)`, or a backtick) and execute arbitrary code on the host system.
- [REMOTE_CODE_EXECUTION]: The skill architecture establishes a 'plan-execute-review' loop where the agent is explicitly instructed to perform high-privilege actions (including `run_command`, `create_file`, and `modify_file`) based on the contents of a JSON file returned by an external tool (`codex exec`). This creates a significant indirect prompt injection surface: if the external model's output is compromised or manipulated, the agent will execute the resulting malicious commands or file changes without manual human review or safety validation.
- [EXTERNAL_DOWNLOADS]: The skill requires the installation of an external NPM package (`@openai/codex`) and references scripts from a separate skill directory (`codex-orchestrator`), introducing dependencies on external code that is not contained within the skill's own repository.
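
To illustrate the command-injection finding, here is a minimal sketch of the general remedy: pass the objective as a single element of an argument vector instead of splicing it into a string that a shell re-parses. It is written in Python rather than as a patch to `scripts/cto-invoke.sh`, whose contents are not shown in this report. The helper names `build_argv` and `run` are hypothetical, and a Python one-liner that echoes its first argument stands in for the real CLI so the example runs anywhere.

```python
import subprocess
import sys


def build_argv(prompt: str) -> list[str]:
    """Build an argument vector in which the prompt is exactly one element."""
    # Hypothetical stand-in for the real CLI: a child Python process that
    # prints its first argument back unchanged.
    return [sys.executable, "-c", "import sys; print(sys.argv[1])", prompt]


def run(prompt: str) -> str:
    """Run the stand-in command without involving a shell."""
    # subprocess.run with a list and the default shell=False hands the
    # arguments directly to the child process, so quotes, semicolons,
    # $(...) and backticks in the prompt are never interpreted.
    result = subprocess.run(
        build_argv(prompt), capture_output=True, text=True, check=True
    )
    return result.stdout.rstrip("\n")


# An objective crafted to break out of a double-quoted eval string.
malicious = 'fix the bug"; echo INJECTED; echo "$(id)`whoami`'
echoed = run(malicious)
```

With this approach the crafted objective reaches the child process as inert text. In a Bash script the analogous pattern is to store the command in an array and expand it with `"${cmd[@]}"` rather than calling `eval` on an interpolated string.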
Recommendations
- AI detected serious security threats
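
One way to act on the remote-code-execution finding is to validate the returned plan before the agent performs any step. The sketch below is illustrative only: the plan schema (a top-level `steps` list whose items carry `action`, `command`, and `path` keys), the binary allow-list, and the function names are all assumptions, since the report does not describe the actual JSON format produced by `codex exec`.

```python
import json
import shlex
from pathlib import PurePosixPath

# Action names taken from the audit text; everything else here is assumed.
ALLOWED_ACTIONS = {"run_command", "create_file", "modify_file"}
ALLOWED_BINARIES = {"git", "ls", "cat", "pytest"}  # hypothetical allow-list
SHELL_METACHARS = set(";|&$`<>\n")


def validate_step(step: dict) -> list[str]:
    """Return a list of problems for one plan step (empty means acceptable)."""
    action = step.get("action")
    if action not in ALLOWED_ACTIONS:
        return [f"action {action!r} is not allowed"]
    if action == "run_command":
        command = step.get("command")
        if not isinstance(command, str) or not command.strip():
            return ["missing command string"]
        if any(ch in SHELL_METACHARS for ch in command):
            return ["command contains shell meta-characters"]
        try:
            argv = shlex.split(command)
        except ValueError:
            return ["command cannot be parsed"]
        if argv[0] not in ALLOWED_BINARIES:
            return [f"binary {argv[0]!r} is not allow-listed"]
        return []
    # create_file / modify_file: require a relative path inside the workspace.
    path = step.get("path")
    if not isinstance(path, str) or not path:
        return ["missing path"]
    if path.startswith("/") or ".." in PurePosixPath(path).parts:
        return ["path escapes the workspace"]
    return []


def validate_plan(raw: str) -> list[str]:
    """Parse the raw JSON plan and collect problems across all steps."""
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    steps = plan.get("steps") if isinstance(plan, dict) else None
    if not isinstance(steps, list):
        return ["plan has no 'steps' list"]
    problems = []
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"step {i}: not an object")
            continue
        problems.extend(f"step {i}: {p}" for p in validate_step(step))
    return problems
```

A caller would execute the plan only when `validate_plan` returns an empty list and would otherwise surface the problems for human review. An allow-list of this kind narrows the injection surface but does not replace manual approval of high-privilege steps.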
Audit Metadata