codex-issue-plan-execute
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
- [Indirect Prompt Injection] (HIGH): The skill implements a multi-agent autonomous workflow that processes external inputs (GitHub/tracker issues) which may contain malicious instructions.
  - Ingestion points: Issues are ingested in Phase 2 ('Planning Pipeline') from external sources.
  - Boundary markers: No explicit delimiters (e.g., XML tags) or 'ignore embedded instructions' prompts are documented for the interpolation of issue content into the agent context.
  - Capability inventory: The skill possesses high-privilege tools including Bash, Write, and Task, and explicitly manages an 'Execution Agent' designed to implement changes.
  - Sanitization: There is no evidence of validation, escaping, or filtering of the issue content before it is transformed into a 'solution' and executed.
- [Command Execution] (MEDIUM): The skill utilizes the Bash tool for directory setup and likely for implementation in Phase 3. The use of Bash in an autonomous mode (Execution Mode: Autonomous) combined with external data ingestion creates a direct path for Remote Code Execution (RCE) if an attacker-controlled issue can influence the commands generated by the Planning Agent.
Recommendations
- AI detected serious security threats
Audit Metadata