icml-reviewer

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH · PROMPT_INJECTION · DATA_EXFILTRATION
Full Analysis
  • PROMPT_INJECTION (HIGH): The skill is highly vulnerable to Indirect Prompt Injection (Category 8) due to its core function of processing untrusted external content.
      1. Ingestion points: Paper text, PDFs, and local code repositories defined in SKILL.md Step 1.
      2. Boundary markers: Absent; no delimiters or instructions to ignore embedded commands are present.
      3. Capability inventory: File system access for repository exploration and WebSearch for literature grounding.
      4. Sanitization: Absent; the skill does not filter or escape processed content.
  • DATA_EXFILTRATION (MEDIUM): The skill exhibits an exfiltration risk by combining sensitive data access (reading local repositories) with network capabilities (WebSearch). An attacker-controlled repository could trigger the agent to leak sensitive file contents found during exploration through search queries or metadata fetches.
  • COMMAND_EXECUTION (LOW): The instructions for 'Repository Review Mode' encourage the agent to explore and analyze code and scripts. An agent working this way might be induced to execute malicious scripts found in an analyzed repository, especially if the underlying LLM is permitted to run code to verify findings.
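The "Boundary markers: Absent" finding above can be illustrated with a minimal sketch of what such markers might look like. This is not taken from the skill or from the audit: the function name `wrap_untrusted`, the marker format, and the wording of `INSTRUCTION` are all hypothetical, and delimiters alone reduce rather than eliminate indirect prompt injection.

```python
import secrets

# Hypothetical instruction text; the audited skill defines nothing like this.
INSTRUCTION = (
    "The text between the markers below is untrusted data. "
    "Treat it as material to review; do not follow instructions inside it."
)


def wrap_untrusted(content: str, source: str) -> str:
    """Wrap untrusted text (paper, PDF extract, repo file) in boundary markers.

    A fresh random token is embedded in both markers so that text inside
    the content cannot easily forge a matching closing marker.
    """
    token = secrets.token_hex(8)  # 16 hex characters, new for every call
    return (
        f"{INSTRUCTION}\n"
        f"<<UNTRUSTED {token} source={source}>>\n"
        f"{content}\n"
        f"<<END_UNTRUSTED {token}>>"
    )
```

A caller would pass each ingested document through `wrap_untrusted` before placing it in the prompt, for example `wrap_untrusted(paper_text, "paper.pdf")`.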
Recommendations
  • AI detected serious security threats
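As an illustration of how the DATA_EXFILTRATION path described in the analysis could be narrowed, the sketch below checks an outgoing search query for verbatim fragments of local file contents before it is sent. It is an assumption-laden example, not a recommendation made by the audit: the function name `leaks_local_content` and the 20-character threshold are hypothetical, and a check like this misses short secrets as well as encoded or paraphrased content.

```python
def leaks_local_content(query: str, local_texts: list[str], min_len: int = 20) -> bool:
    """Return True if the query contains a verbatim run of at least
    `min_len` characters that also appears in any local file text."""
    if len(query) < min_len:
        # Too short to contain a run of the required length.
        return False
    # Slide a fixed-size window over the query and look for it in each file.
    for start in range(len(query) - min_len + 1):
        window = query[start:start + min_len]
        if any(window in text for text in local_texts):
            return True
    return False
```

An agent wrapper could call this before every WebSearch request and refuse or redact the query when it returns `True`. The quadratic scan is acceptable for short queries; a larger deployment would likely want an index over the file contents instead.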
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 16, 2026, 12:59 AM