paper-to-code

Warn

Audited by Gen Agent Trust Hub on Feb 20, 2026

Risk Level: MEDIUMREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [Unverifiable Dependencies & Remote Code Execution] (MEDIUM): The skill workflow involves generating a 'reproduce.sh' script and a 'requirements.txt' file based on untrusted input (the research paper). Stage 4 of the workflow explicitly defines a 'Re-run until successful' debugging loop. This creates a direct path for executing arbitrary code that has been generated from an untrusted external source.
  • [Dynamic Execution] (MEDIUM): The skill's primary function is the runtime generation of executable Python files and shell scripts, followed by their execution in a feedback loop. This dynamic generation and execution of code based on untrusted data is a significant security risk if the input is not strictly validated.
  • [Indirect Prompt Injection] (LOW): The skill ingests complex external data (PDFs or URLs) which can contain embedded instructions designed to influence the code generation process.
  • Ingestion points: The '$0' argument in 'SKILL.md' and the '{paper_content}' placeholder in 'paper-to-code-prompts.md'.
  • Boundary markers: Absent. The prompts interpolate the paper content directly without using delimiters or instructions to ignore embedded commands.
  • Capability inventory: The skill has the capability to write multiple files to the filesystem and execute shell scripts. The 'reproduce.sh' generation combined with the Stage 4 execution loop provides a high-capability environment for an injection to exploit.
  • Sanitization: No sanitization or verification of the methodology described in the paper is performed before it is translated into code.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Feb 20, 2026, 05:23 AM