paper2code
Fail
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: CRITICAL (REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION, EXTERNAL_DOWNLOADS)
Full Analysis
- [PROMPT_INJECTION] (HIGH): The file 04_implementation_guide.md contains instructions that command the agent to bypass human-in-the-loop oversight. Specifically, it states '파일 간 허락을 구하지 말 것' (Do not ask permission between files) and '완료될 때까지 계속 구현' (Continue implementing until finished). This is a direct attempt to override standard AI safety guardrails for tool usage.
- [REMOTE_CODE_EXECUTION] (HIGH): The skill's primary function is to generate and execute code based on external, untrusted research papers. Automated scans identified an RCE pattern where a file is downloaded from a remote source (arXiv) and subsequently staged for execution via the agent's bash and python tools.
- [COMMAND_EXECUTION] (HIGH): The skill relies on direct shell command execution for environment setup and file management, using commands like uv init, uv add, and cat autonomously. Phase 4 mandates the use of these tools to 'directly replicate' papers without user intervention.
- [EXTERNAL_DOWNLOADS] (MEDIUM): Phase 0 (05_reference_search.md) directs the agent to search for and evaluate third-party code repositories on GitHub. While it includes basic advice on checking licenses, it encourages the ingestion and potential execution of unverified external code.
- [INDIRECT_PROMPT_INJECTION] (HIGH): The skill is highly vulnerable to indirect prompt injection because it consumes untrusted data (papers) and possesses high-privilege write/execute capabilities. It lacks boundary markers and sanitization, and requires the agent to 'accurately copy' technical details, which could include malicious payloads. (Ingestion: Research papers; Boundaries: Missing; Capabilities: Shell/Python execution; Sanitization: None).
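The two gaps named in the findings above, missing boundary markers around ingested paper text and shell commands run without human approval, can be illustrated with a minimal Python sketch. This is not code from paper2code or from the audit tooling. The names wrap_untrusted, require_approval, and ALLOWED_COMMANDS are hypothetical, and the marker format is an assumption chosen for illustration only.

```python
import secrets

# Hypothetical allowlist: the audit names `uv init`, `uv add`, and `cat`
# as the commands the skill runs, so only those executables are permitted here.
ALLOWED_COMMANDS = {"uv", "cat"}


def wrap_untrusted(text: str) -> str:
    """Wrap untrusted paper text in boundary markers carrying a random token.

    The token is generated per call, so text inside the paper cannot
    predict it and forge a matching closing marker.
    """
    token = secrets.token_hex(8)
    return f"<<UNTRUSTED:{token}>>\n{text}\n<<END_UNTRUSTED:{token}>>"


def require_approval(command: list[str], ask=input) -> bool:
    """Return True only if the command is allowlisted AND a human approves it.

    `ask` defaults to the built-in input(); it is a parameter so the
    prompt can be replaced in tests or non-interactive environments.
    """
    if not command or command[0] not in ALLOWED_COMMANDS:
        return False
    answer = ask(f"Run {' '.join(command)!r}? [y/N] ")
    return answer.strip().lower() == "y"
```

In this sketch the default answer is "no": an empty reply or any command outside the allowlist is rejected, which is the opposite of the 'do not ask permission between files' behavior flagged above. Boundary markers alone do not make untrusted text safe; they only give the agent a way to tell quoted data from instructions.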
Recommendations
- CRITICAL: Downloads and executes remote code from untrusted source(s): https://arxiv.org/pdf/xxxx.xxxxx.pdf - DO NOT USE
- AI detected serious security threats
Audit Metadata