autoresearch
Warn
Audited by Gen Agent Trust Hub on Mar 18, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: The skill executes arbitrary shell commands provided by the user (METRIC_COMMAND) in an autonomous loop, with no pause for approval between iterations. Evidence: 'Phase 3, Step 4: RUN - Execute the metric command. Redirect output to run.log'.
- [REMOTE_CODE_EXECUTION]: The agent is instructed to autonomously generate code, write it to the local filesystem, and then execute it via the metric command. This creates a cycle in which generated code runs without manual review. Evidence: 'Phase 3, Step 2: EDIT - Modify the in-scope file(s)... Step 4: RUN - Execute the metric command'.
- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it reads and analyzes code from the repository (the 'in-scope files') and uses that code to generate the next iteration of logic. Malicious instructions embedded in the codebase could influence the agent's behavior during its autonomous 'THINK' phase. Evidence: 'Phase 2, Step 2: Read in-scope files' and 'Phase 3, Step 1: THINK - Analyze previous results and the current code'.
- [INDIRECT_PROMPT_INJECTION_MANDATORY_EVIDENCE]: Ingestion points: 'Phase 2, Step 2: Read in-scope files' (SKILL.md). Boundary markers: None present. Capability inventory: 'Phase 3, Step 2: EDIT (Modify files)' and 'Phase 3, Step 4: RUN (Shell command execution)' (SKILL.md). Sanitization: No sanitization or filtering of file content is described before analysis.
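Two of the gaps listed above (no boundary markers around ingested file content, and no approval pause before the metric command runs) could be narrowed along the following lines. This is a minimal sketch, not part of the audited skill: the marker strings, the function names `wrap_untrusted` and `run_metric_with_approval`, and the `approve` callback are all hypothetical, and only `run.log` and the notion of a metric command come from the findings. Boundary markers reduce, but do not eliminate, the risk of indirect prompt injection.

```python
import shlex
import subprocess
from typing import Callable, Optional

# Hypothetical marker strings; the audited SKILL.md defines no boundary markers.
BEGIN = "<<<BEGIN_UNTRUSTED_FILE>>>"
END = "<<<END_UNTRUSTED_FILE>>>"


def wrap_untrusted(path: str, text: str) -> str:
    """Fence repository content so it can be presented as data, not instructions."""
    # Strip marker look-alikes already present in the file, so the content
    # cannot close the fence early and smuggle text outside it.
    safe = text.replace(BEGIN, "[marker removed]").replace(END, "[marker removed]")
    return f"{BEGIN} path={path!r}\n{safe}\n{END}"


def run_metric_with_approval(
    metric_command: str,
    approve: Callable[[str], bool],
    log_path: str = "run.log",
) -> Optional[int]:
    """Run the metric command only if approve(metric_command) returns True.

    Returns the command's exit code, or None when the run was declined.
    """
    if not approve(metric_command):
        return None
    # Splitting into an argument list and not using shell=True means the
    # string is not interpreted by a shell (no pipes, globs, or `;` chaining).
    with open(log_path, "w") as log:
        result = subprocess.run(
            shlex.split(metric_command),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return result.returncode
```

In an interactive setting, `approve` might be something like `lambda cmd: input(f"Run {cmd!r}? [y/N] ").strip().lower() == "y"`. Avoiding the shell is a deliberate restriction in this sketch: metric commands that rely on pipes or redirection would need to be wrapped in a script first.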
Audit Metadata