experiment-pipeline

Warn

Audited by Gen Agent Trust Hub on Apr 14, 2026

Risk Level: MEDIUMREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The instructions in SKILL.md and references/stage-protocols.md (Stage 1) explicitly direct the agent to find 'original baseline code' from 'official repos' or 're-implementations' and get the code running in the local environment. This process involves downloading and executing unverified code from external third-party sources.
  • [COMMAND_EXECUTION]: The skill implements a core 'generate → execute' loop across all stages, where the agent is instructed to write and execute code for experiments using the execute tool. This results in the execution of arbitrary, dynamically generated logic that is not known at the time of analysis.
  • [PROMPT_INJECTION]: The skill is designed to ingest and process data from external research papers, proposals, and prior experiment memories (such as /memory/experiment-memory.md). It lacks instructions for sanitizing these inputs or using boundary markers, making it susceptible to indirect prompt injection if those external materials contain malicious directives.
  • [COMMAND_EXECUTION]: The references/stage-protocols.md file suggests resolving dependencies and fixing compatibility issues for external code, which often involves running package managers and environment setup commands that could be exploited by malicious repository configurations.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 14, 2026, 01:07 PM