autoresearch
Fail
Audited by Gen Agent Trust Hub on May 3, 2026
Risk Level: HIGH
Categories: PROMPT_INJECTION, DATA_EXFILTRATION, COMMAND_EXECUTION, REMOTE_CODE_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: The instructions direct the agent to override standard operational constraints and safety protocols by continuing operations indefinitely without seeking user permission. Evidence: The 'NEVER STOP' section and the description of the agent as a 'tireless researcher, not an assistant waiting for permission' command the agent to disregard conversational pauses.
- [DATA_EXFILTRATION]: The skill performs automated discovery of sensitive files and credentials on the host system. Evidence: The Phase 0 (SCOUT) logic utilizes the Glob tool to identify files matching patterns such as '.env*', 'config/', and '*.key'.
- [COMMAND_EXECUTION]: The skill executes arbitrary shell commands provided through configuration parameters. Evidence: The Bash tool is used to execute the 'eval_harness' and 'checks_script', which are defined by user-provided strings or auto-detected commands.
- [REMOTE_CODE_EXECUTION]: The skill establishes an autonomous loop that modifies source code and immediately executes the changes. Evidence: The 'LOOP FOREVER' routine uses the Edit tool to modify target files and then executes them via the Bash tool without intervening human review.
- [PROMPT_INJECTION]: The skill ingests untrusted code and log data into the prompt context to guide autonomous actions, creating a surface for indirect prompt injection. Evidence: The agent reads target source files and execution logs to hypothesize improvements. Ingestion points: target files; Boundary markers: absent; Capability inventory: Bash, Edit, Write; Sanitization: absent.
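
To make the DATA_EXFILTRATION finding concrete, the sketch below shows which relative paths the quoted patterns ('.env*', 'config/', '*.key') would plausibly select. It is an illustration only: the function name `matches_sensitive_pattern` is hypothetical, and the matching rules (a trailing-slash pattern matches a directory component, other patterns match the file name) are an assumption, since the audit does not state the exact semantics of the skill's Glob tool.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns quoted in the DATA_EXFILTRATION finding.
SENSITIVE_PATTERNS = [".env*", "config/", "*.key"]


def matches_sensitive_pattern(path: str) -> bool:
    """Return True if a relative POSIX path is caught by one of the patterns.

    Assumed semantics (not confirmed by the audit): a pattern ending in "/"
    matches any path that has that directory as a component; every other
    pattern is matched against the file name only.
    """
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    for pattern in SENSITIVE_PATTERNS:
        if pattern.endswith("/"):
            # Directory pattern: look at every component except the file name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatchcase(parts[-1], pattern):
            return True
    return False


if __name__ == "__main__":
    for candidate in [".env.local", "config/database.yml", "certs/server.key", "src/main.py"]:
        print(candidate, matches_sensitive_pattern(candidate))
```

A reviewer could run a check like this over a repository listing to estimate how many credential-bearing files such a scout phase would surface before deciding whether to allow the skill.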
Recommendations
- AI detected serious security threats
Audit Metadata