ai-paper-reproduction

Fail

Audited by Gen Agent Trust Hub on Apr 1, 2026

Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/orchestrate_repro.py uses subprocess.run to execute command strings that are parsed directly from the README files of external, untrusted repositories.
  • [REMOTE_CODE_EXECUTION]: By design, the skill fetches and executes code and shell commands from remote sources, allowing for the execution of arbitrary instructions provided by an external party.
  • [PROMPT_INJECTION]: The skill is vulnerable to Indirect Prompt Injection (Category 8) because it treats the instructions and code blocks found in external, untrusted documentation as trusted input for execution.
  • [PROMPT_INJECTION]: Mandatory Evidence Chain for Indirect Prompt Injection:
      • Ingestion points: External README.md files processed by repo-intake-and-plan and orchestrate_repro.py.
      • Boundary markers: Absent; the skill does not use delimiters or instructions to ignore malicious commands embedded in the README content.
      • Capability inventory: The skill has the capability to execute shell commands via subprocess.run and manage repository patches.
      • Sanitization: None; the skill uses shlex.split for parsing but performs no validation, whitelisting, or sandboxing of the commands extracted from the remote source.
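
The missing validation step noted under Sanitization could look roughly like the sketch below. It is illustrative only and not taken from the audited skill: the function name validate_command, the ALLOWED_EXECUTABLES set, and the SHELL_METACHARACTERS set are all hypothetical, and the right allowlist would depend on what the skill legitimately needs to run.

```python
from __future__ import annotations

import shlex

# Hypothetical allowlist; the audited skill defines no such list.
ALLOWED_EXECUTABLES = {"python", "python3", "pip", "pytest"}

# Characters that indicate shell chaining, redirection, or substitution.
SHELL_METACHARACTERS = set(";&|`$<>\n")


def validate_command(command: str) -> list[str]:
    """Return an argv list for a README-derived command, or raise ValueError."""
    # Reject anything that only makes sense when interpreted by a shell.
    if any(ch in SHELL_METACHARACTERS for ch in command):
        raise ValueError(f"shell metacharacters not permitted: {command!r}")

    # shlex.split itself raises ValueError on unbalanced quotes.
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty command")

    # Only explicitly approved executables may be launched.
    if argv[0] not in ALLOWED_EXECUTABLES:
        raise ValueError(f"executable not on allowlist: {argv[0]!r}")
    return argv
```

The returned list would be passed to subprocess.run as an argument list, without shell=True. An allowlist of this kind only narrows which binaries can be started: a command such as python train.py still runs code from the untrusted repository, so it complements sandboxing and does not replace it.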
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level: HIGH
Analyzed: Apr 1, 2026, 07:21 PM