autoresearch

Fail

Audited by Gen Agent Trust Hub on Mar 12, 2026

Risk Level: HIGHEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill provides instructions and a setup script (setup.sh) to download the uv package manager installer from https://astral.sh/uv/install.sh and execute it via a shell pipe. Astral is a well-known service provider, and this is the official installation method for the tool.
  • [REMOTE_CODE_EXECUTION]: The autonomous research loop functions by having an AI agent edit a local script (train.py) and then execute it using the uv run command. This dynamic execution of agent-modified code is the core purpose of the skill for performing ML research.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection as it processes research directives from program.md that influence the agent's code modifications and experimental strategies.
  • Ingestion points: The agent reads directives from program.md (read-only) and modifies train.py (read/write) based on its interpretation.
  • Boundary markers: No explicit security delimiters or 'ignore embedded instructions' warnings are present to isolate the directive content from the agent's primary system instructions.
  • Capability inventory: The agent has access to Bash, Read, Write, Edit, Glob, Grep, and WebFetch tools, and can perform git commit, git reset, and uv run train.py.
  • Sanitization: The skill does not implement sanitization or validation for the contents of the program.md file before it is processed by the agent.
  • [COMMAND_EXECUTION]: The skill includes several utility shell scripts (check-hardware.sh, setup.sh, run-loop.sh) that perform system environment checks, repository cloning via git, and ML process execution.
Recommendations
  • HIGH: Downloads and executes remote code from: https://astral.sh/uv/install.sh - DO NOT USE without thorough review
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 12, 2026, 06:24 AM