orchestrator
Warn
Audited by Gen Agent Trust Hub on Mar 16, 2026
Risk Level: MEDIUMPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: The skill explicitly instructs users to bypass safety guardrails by using the '--allow-all' flag and '/allow-all' command. This disables the requirement for human approval before the agent performs sensitive operations, effectively removing the 'human-in-the-loop' safety mechanism for the duration of the pipeline execution.
- [COMMAND_EXECUTION]: The skill advocates for running the agent in an unrestricted mode, which permits the execution of arbitrary shell commands and filesystem modifications. The workflow includes automated file creation (e.g., in 'docs/tasks/') and git commits performed by subagents without user oversight or intervention.
- [PROMPT_INJECTION]: The skill processes untrusted user input (the 'OBJECTIVE') and passes it directly to multiple subagents (researcher, architect, ralph). A malicious objective could manipulate these subagents into performing unauthorized actions, a risk magnified by the explicit instructions to disable safety filters. Ingestion points: User-provided 'OBJECTIVE' and 'CONTEXT' fields in the invocation prompt. Boundary markers: Absent; user input is directly interpolated into subagent instructions without delimiters or 'ignore' warnings. Capability inventory: Automated file writes, directory creation, and git commits across several phases. Sanitization: None; the input is used as-is for synthesis and task generation.
Audit Metadata