codex-issue-plan-execute

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [Indirect Prompt Injection] (HIGH): The skill implements a multi-agent autonomous workflow that processes external inputs (GitHub/tracker issues) which may contain malicious instructions.
  • Ingestion points: Issues are ingested in Phase 2 ('Planning Pipeline') from external sources.
  • Boundary markers: No explicit delimiters (e.g., XML tags) or 'ignore embedded instructions' prompts are documented for the interpolation of issue content into the agent context.
  • Capability inventory: The skill possesses high-privilege tools including Bash, Write, and Task, and explicitly manages an 'Execution Agent' designed to implement changes.
  • Sanitization: There is no evidence of validation, escaping, or filtering of the issue content before it is transformed into a 'solution' and executed.
  • [Command Execution] (MEDIUM): The skill utilizes the Bash tool for directory setup and likely for implementation in Phase 3. The use of Bash in an autonomous mode (Execution Mode: Autonomous) combined with external data ingestion creates a direct path for Remote Code Execution (RCE) if an attacker-controlled issue can influence the commands generated by the Planning Agent.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 06:27 AM