codex-review
Fail
Audited by Gen Agent Trust Hub on Mar 31, 2026
Risk Level: HIGHCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill is highly vulnerable to shell command injection. In Step 3, user-provided arguments (e.g., a model name override via
/codex-review gpt-5.4) are interpolated directly into a shell command (codex exec -m [MODEL]). An attacker can provide a string containing shell metacharacters like semicolons or backticks to execute arbitrary code on the host system.\n- [COMMAND_EXECUTION]: In Step 6, the skill constructs a shell command (codex exec resume) that includes summaries of "Changes made" and "Skipped" rationale. If these strings contain shell-active characters and are not properly escaped by the agent, they create a secondary command injection vector.\n- [EXTERNAL_DOWNLOADS]: The skill instructs users to install an unverified and non-standard npm package,@openai/codex, globally. This package does not correspond to official OpenAI public releases and could contain malicious code or mismanage the API credentials it requires.\n- [DATA_EXFILTRATION]: The core functionality involves reading local project context, implementation plans, and git diff outputs, and sending them to an external third-party API via the codex CLI. This poses a risk of exposing sensitive intellectual property or secrets contained within the code.\n- [PROMPT_INJECTION]: The skill is susceptible to Indirect Prompt Injection through the processing of untrusted data.\n - Ingestion points: Untrusted data enters the context via git diff outputs, project documentation (CLAUDE.md), and implementation plans.\n
- Boundary markers: The skill lacks explicit boundary markers or instructions to the model to ignore embedded commands within the content being reviewed.\n
- Capability inventory: The skill has the capability to execute shell commands, read local files, and perform network operations.\n
- Sanitization: No sanitization or filtering of external content is performed before it is passed to the reviewing model.
Recommendations
- AI detected serious security threats
Audit Metadata