paper2code
Fail
Audited by Gen Agent Trust Hub on Mar 5, 2026
Risk Level: HIGHREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [REMOTE_CODE_EXECUTION]: The skill is designed to download content from remote URLs (arXiv) and then generate and execute code based on that content. This behavior allows for the potential execution of harmful code if the downloaded paper contains malicious technical specifications intended to exploit the agent's code generation logic.
- [COMMAND_EXECUTION]: The skill utilizes a high-privilege execution environment, invoking tools like
curl,pdftotext,uv,python, andbashto interact with the system. The04_implementation_guide.mdexplicitly instructs the agent to use these tools to 'reproduce the paper directly', which includes package installation and script execution. - [EXTERNAL_DOWNLOADS]: The skill fetches PDF files from arXiv's public repository. Additionally, the
05_reference_search.mdprompt encourages the agent to search for and study implementation designs from unvetted third-party GitHub repositories. - [PROMPT_INJECTION]: There is a significant risk of Indirect Prompt Injection as the skill processes untrusted external data and maps it to high-impact system capabilities.
- Ingestion points: Research papers are retrieved via
curland converted to text for processing (SKILL.md). - Boundary markers: The skill lacks explicit delimiters or instructions to prevent the agent from following directives found within the paper text.
- Capability inventory: The agent has extensive permissions to write files, install packages (
uv add), and execute code (uv run,bash) (04_implementation_guide.md). - Sanitization: No validation or sanitization is performed on the content extracted from the paper before it influences the code generation process.
Recommendations
- HIGH: Downloads and executes remote code from: https://arxiv.org/pdf/xxxx.xxxxx.pdf - DO NOT USE without thorough review
- AI detected serious security threats
Audit Metadata