autonomous-orchestration

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • PROMPT_INJECTION (HIGH): The trigger_research_cycle function in reference/failure-recovery.md constructs a prompt using $(get_attempt_history "$issue"). This history often includes logs or previous worker outputs which are untrusted. If these logs contain 'Ignore previous instructions' or other jailbreak patterns, the research agent's behavior can be overridden.
  • INDIRECT_PROMPT_INJECTION (HIGH):
  • Ingestion points: The skill reads failure logs and issue history in reference/failure-recovery.md via get_attempt_history and get_worker_issue.
  • Boundary markers: None. The untrusted data is directly concatenated into the markdown-formatted prompt.
  • Capability inventory: The system can revert pull requests (rollback_pr), modify project boards (update_project_status), and spawn new workers with full-auto capabilities (spawn_worker).
  • Sanitization: No evidence of sanitization, filtering, or escaping for the interpolated content.
  • COMMAND_EXECUTION (MEDIUM): The skill uses codex exec to run agents with --full-auto permissions. While the research agent is labeled read-only, its output (the 'research context') is fed into the next worker, which has significantly higher privileges, including git push and gh project item-edit.
  • DATA_EXPOSURE (LOW): The enter_sleep function posts internal state, including orchestration IDs and PR lists, to public GitHub issue comments. While expected for this workflow, it increases the metadata available to potential attackers.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 06:53 AM