experiment-bridge

Warn

Audited by Gen Agent Trust Hub on Apr 19, 2026

Risk Level: MEDIUMEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill supports cloning arbitrary external repositories via a user-defined base repository URL.
  • Evidence: git clone <BASE_REPO> base_repo/ in Phase 2.
  • [REMOTE_CODE_EXECUTION]: The skill implements a 'generate-and-run' pattern where it writes Python scripts based on external documentation and then executes them.
  • Evidence: Phase 2 ('Implement missing pieces') describes writing training and evaluation scripts, which are then executed in Phase 3 and Phase 4 using the /run-experiment and /experiment-queue tools.
  • [COMMAND_EXECUTION]: The skill executes shell commands that may be derived from or influenced by the contents of the experiment plan.
  • Evidence: Phase 3 executes /run-experiment [sanity experiment command] based on the milestone order parsed from EXPERIMENT_PLAN.md.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection (Category 8) because it ingests untrusted Markdown files to drive code generation and execution.
  • Ingestion points: EXPERIMENT_PLAN.md, FINAL_PROPOSAL.md, IDEA_CANDIDATES.md, and IDEA_REPORT.md (Phase 1 and Phase 2).
  • Boundary markers: None identified; the skill does not use delimiters or instructions to ignore embedded commands within the processed files.
  • Capability inventory: Access to Bash(*), Write, Edit, and execution tools like /run-experiment and /experiment-queue across all phases.
  • Sanitization: No validation or escaping of the input Markdown content is performed before it is used to generate executable scripts.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 19, 2026, 03:14 AM