paper-to-code
Pass
Audited by Gen Agent Trust Hub on Apr 21, 2026
Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it ingests untrusted research paper content which is then used to generate executable code, scripts, and configuration files.
- Ingestion points: External research paper content enters the agent context via the
$0argument and is interpolated into the{paper_content}placeholder in the planning prompts. - Boundary markers: The prompts lack explicit delimiters or instructions to ignore embedded instructions within the research paper, which could allow adversarial content in a paper to influence code generation.
- Capability inventory: The skill has the capability to write multiple files (including
reproduce.shandrequirements.txt) and includes a debugging workflow intended to execute the generated code and resolve errors. - Sanitization: No sanitization or validation is performed on the input paper content before it is processed by the model for code generation.
- [COMMAND_EXECUTION]: The skill's primary workflow involves generating and executing a reproduction shell script (
reproduce.sh) and a Python environment. While this is the intended functionality, the potential for executing code influenced by untrusted input (the research paper) warrants caution.
Audit Metadata