Arxiv Paper Reader

Pass

Audited by Gen Agent Trust Hub on Feb 19, 2026

Risk Level: SAFEDATA_EXFILTRATION
Full Analysis
  • Indirect Prompt Injection (LOW): The skill processes untrusted external data by fetching LaTeX source and abstracts from ArXiv. This content is interpolated directly into LLM prompts in agents/reader_agent.py and agents/summary_agent.py.
  • Ingestion points: ArXiv paper metadata (title, abstract) and LaTeX source code are retrieved via the arxiv and arxiv-to-prompt libraries in main.py.
  • Boundary markers: The prompts use XML-like tags (e.g., <initial_summary>) for structured internal data, but external paper content is inserted into the prompt template without strong delimiters or instructions to ignore embedded commands.
  • Capability inventory: The agent does not have access to any tools (the tools parameter in create_agent is effectively null in existing calls). Its only output is printed text.
  • Sanitization: The latex_parser.py file contains a _clean_latex function, but it only performs layout cleanup (removing labels and whitespace) and does not sanitize the text for malicious instructions targeting the LLM.
  • Data Exposure (LOW): The skill manages an LLM_API_KEY via a .env file and config.py. While no secrets are hardcoded in the provided files, the skill's architecture requires the user to store sensitive credentials in a local environment file.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 19, 2026, 01:36 PM