autoresearch

Fail

Audited by Gen Agent Trust Hub on Mar 16, 2026

Risk Level: HIGHREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The setup script and installation instructions in SKILL.md and scripts/setup.sh use a high-risk command pattern by piping a remote script directly to the shell (curl -LsSf https://astral.sh/uv/install.sh | sh). This is a common setup procedure for the well-known uv package manager provided by Astral. \n- [COMMAND_EXECUTION]: The primary function of the skill is to allow an AI agent to autonomously edit the train.py script and then execute it via uv run. This design enables the execution of dynamically generated code based on external directives, constituting a major security surface. \n- [PROMPT_INJECTION]: The framework uses program.md as the source of truth for the agent's research loop, which makes it vulnerable to indirect prompt injection. \n
  • Ingestion points: The agent reads research directives from program.md in SKILL.md and scripts/run-loop.sh to guide code changes in train.py. \n
  • Boundary markers: No explicit delimiters or boundary instructions are used to prevent the agent from following malicious commands embedded in the program.md file. \n
  • Capability inventory: The agent has access to Bash, Read, Write, Edit, Glob, Grep, and WebFetch tools (as specified in SKILL.md), and the shell scripts execute potentially dangerous operations like git reset and uv run. \n
  • Sanitization: There is no evidence of sanitization or safety checks performed on the directives sourced from program.md before they are used to generate executable Python code.
Recommendations
  • HIGH: Downloads and executes remote code from: https://astral.sh/uv/install.sh - DO NOT USE without thorough review
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 16, 2026, 08:04 PM