paper-to-code

Pass

Audited by Gen Agent Trust Hub on Apr 21, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it ingests untrusted research paper content which is then used to generate executable code, scripts, and configuration files.
  • Ingestion points: External research paper content enters the agent context via the $0 argument and is interpolated into the {paper_content} placeholder in the planning prompts.
  • Boundary markers: The prompts lack explicit delimiters or instructions to ignore embedded instructions within the research paper, which could allow adversarial content in a paper to influence code generation.
  • Capability inventory: The skill has the capability to write multiple files (including reproduce.sh and requirements.txt) and includes a debugging workflow intended to execute the generated code and resolve errors.
  • Sanitization: No sanitization or validation is performed on the input paper content before it is processed by the model for code generation.
  • [COMMAND_EXECUTION]: The skill's primary workflow involves generating and executing a reproduction shell script (reproduce.sh) and a Python environment. While this is the intended functionality, the potential for executing code influenced by untrusted input (the research paper) warrants caution.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 21, 2026, 07:28 AM