researchclaw
Warn
Audited by Gen Agent Trust Hub on Apr 6, 2026
Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill generates Python scripts (e.g.,
experiment.py) at runtime and executes them locally using subprocesses when thesandboxmode is enabled in the configuration.- [REMOTE_CODE_EXECUTION]: The pipeline supports anssh_remotemode that facilitates the execution of generated code on remote GPU servers via SSH, extending the potential impact of malicious code to external infrastructure.- [PROMPT_INJECTION]: The skill exhibits a surface for indirect prompt injection because the autonomous agent consumes untrusted input (the research topic) and uses it to generate executable code. There is no evidence of a human-in-the-loop review for the generated code before it is executed. - Ingestion points: Untrusted data enters the pipeline through the
--topicCLI argument or theconfig.yamlfile. - Boundary markers: The instructions lack explicit delimiters or safety warnings to ensure the LLM ignores instructions embedded within the research topic.
- Capability inventory: The skill possesses extensive capabilities including file system writes, local bash execution, and remote SSH access.
- Sanitization: No sanitization or validation mechanisms are described for the generated Python scripts prior to execution.
Audit Metadata