autoresearchclaw-autonomous-research

Fail

Audited by Gen Agent Trust Hub on Mar 16, 2026

Risk Level: HIGH
Flags: REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS, COMMAND_EXECUTION, PROMPT_INJECTION, CREDENTIALS_UNSAFE
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The installation instructions direct users to clone a repository from an unverified GitHub account (aiming-lab/AutoResearchClaw) and run pip install -e . from it. Executing setup scripts from a non-whitelisted source allows arbitrary code to run on the user's system during the installation phase.
  • [EXTERNAL_DOWNLOADS]: The skill fetches its primary application logic from an untrusted GitHub repository and subsequently makes network requests to academic databases (arXiv, Semantic Scholar) and the open web during its research phases.
  • [COMMAND_EXECUTION]: The skill's 23-stage pipeline includes the automatic generation and execution of Python code (CODE_GENERATION and EXPERIMENT_RUN). Although labeled as sandboxed, this execution occurs in the local environment and can be configured to bypass human approval gates via the --auto-approve flag. The configuration also supports persistence via use_cron and parallel process spawning.
  • [PROMPT_INJECTION]: The skill is highly susceptible to indirect prompt injection. It ingests untrusted data from literature APIs and web fetching (LITERATURE_COLLECT, use_web_fetch) without documented boundary markers or sanitization. That data is synthesized into knowledge that informs the CODE_GENERATION stage, giving external content a path to influence local code execution. The skill's capabilities include code execution, file writes, and network access.
  • [CREDENTIALS_UNSAFE]: The skill's configuration requires high-privilege API keys for LLM providers (e.g., OpenAI, OpenRouter) to be stored in environment variables. These keys are accessible to all of the skill's processes, including any dynamically generated and executed experiment scripts.
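To make the first finding concrete: anything placed at module level in a cloned repository's setup.py executes the moment pip install -e . is run, before a user has reviewed any package code. A minimal, self-contained sketch of that mechanism (the setup.py contents below are hypothetical, not taken from aiming-lab/AutoResearchClaw):

```python
import pathlib
import subprocess
import sys
import tempfile

# Hypothetical setup.py: module-level statements run the moment pip
# invokes the file during "pip install -e .", before anything is imported.
setup_py = (
    'print("module-level code ran at install time")\n'
    '# from setuptools import setup\n'
    '# setup(name="autoresearchclaw", version="0.0.1", packages=[])\n'
)

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp, "setup.py")
    path.write_text(setup_py)
    # pip executes the file with the interpreter; invoking it directly
    # reproduces the same effect for this demonstration.
    out = subprocess.run(
        [sys.executable, str(path)], capture_output=True, text=True, check=True
    )

print(out.stdout.strip())  # → module-level code ran at install time
```

The print stands in for an arbitrary payload: the same position in the file could read environment variables, open network connections, or write to disk with the installing user's privileges.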
Recommendations
  • Automated analysis detected serious security threats across installation, execution, and data handling; installing this skill is not recommended.
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 16, 2026, 02:35 PM