codex-code-review
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGHCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (HIGH): The helper script
scripts/review.shis vulnerable to command injection. It uses theevalcommand to execute a string built from unvalidated command-line arguments. An attacker providing a malicious string to parameters like--modelor--branchcould execute arbitrary shell commands. - Evidence:
eval "$cmd" "\"$prompt\""inscripts/review.sh. The variable$cmdis constructed directly from variables like$MODEL,$OUTPUT_FILE, and$BRANCHwhich are populated from user input without sanitization. - EXTERNAL_DOWNLOADS (MEDIUM): The documentation recommends installing
@openai/codexvia npm. This is not a known official package name for OpenAI (which typically usesopenai), posing a supply chain risk if a user installs a malicious package registered under this name. - Evidence:
npm install -g @openai/codexinreferences/codex_cli.md. - PROMPT_INJECTION (LOW): The skill is susceptible to Indirect Prompt Injection. It ingests untrusted data (git diffs and pull request content) and interpolates it directly into prompts without boundary markers or instructions to ignore embedded commands.
- Evidence (Mandatory for Cat 8):
- Ingestion points:
scripts/review.shfetches data viagit diffandgh pr diff. - Boundary markers: Absent. Data is appended directly to the
base_prompt. - Capability inventory: The skill can execute shell commands, write files via the
--outputflag, and access the network via the CLI tool. - Sanitization: None detected.
- DATA_EXFILTRATION (LOW): The skill reads local repository data and PR diffs to send them to an external LLM API. While functional, users should be aware that sensitive code is transmitted externally.
- Evidence:
gh pr diffandgit diffusage inscripts/review.sh.
Recommendations
- AI detected serious security threats
Audit Metadata