autoresearch-fleet

Fail

Audited by Gen Agent Trust Hub on Apr 17, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The scripts/launch.sh script generates an orchestrator.sh file by interpolating variables from fleet.json (such as EVAL_CMD, FLEET_NAME, and METRIC_REGEX) into a shell script using an unquoted heredoc. This creates a shell injection vulnerability where a maliciously crafted configuration can execute arbitrary commands during the orchestrator's initialization or runtime.
  • [REMOTE_CODE_EXECUTION]: The skill's core execution engine in lib/worker-spawn.sh uses the --dangerously-skip-permissions flag when invoking the AI agent CLI. This bypasses the platform's security prompts, allowing the autonomous agent to perform file modifications and execute system commands without human verification.
  • [REMOTE_CODE_EXECUTION]: The orchestrator script utilizes eval to execute the constructed AGENT_CMD and the user-supplied EVAL_CMD. Combined with the lack of sanitization and the skipped permission checks, this significantly increases the risk of arbitrary code execution if the agent is compromised.
  • [PROMPT_INJECTION]: The skill is highly vulnerable to indirect prompt injection. It is designed to ingest data from results.tsv and external sources via WebSearch. Malicious content from these sources could influence the agent's behavior, which is particularly dangerous given the agent's instructions to 'NEVER STOP' and its autonomous tool access.
  • Ingestion points: results.tsv (via tail), WebSearch output, and the mutable solution.py file.
  • Boundary markers: Absent. The agent is encouraged to read these files as raw context.
  • Capability inventory: Shell access (Bash), file system modification (Write), and internet access (WebSearch).
  • Sanitization: None. The orchestrator directly interpolates agent outputs and external data into prompts for subsequent iterations.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Apr 17, 2026, 11:08 AM