Arxiv Paper Reader
Pass
Audited by Gen Agent Trust Hub on Feb 19, 2026
Risk Level: SAFEDATA_EXFILTRATION
Full Analysis
- Indirect Prompt Injection (LOW): The skill processes untrusted external data by fetching LaTeX source and abstracts from ArXiv. This content is interpolated directly into LLM prompts in
agents/reader_agent.pyandagents/summary_agent.py. - Ingestion points: ArXiv paper metadata (title, abstract) and LaTeX source code are retrieved via the
arxivandarxiv-to-promptlibraries inmain.py. - Boundary markers: The prompts use XML-like tags (e.g.,
<initial_summary>) for structured internal data, but external paper content is inserted into the prompt template without strong delimiters or instructions to ignore embedded commands. - Capability inventory: The agent does not have access to any tools (the
toolsparameter increate_agentis effectively null in existing calls). Its only output is printed text. - Sanitization: The
latex_parser.pyfile contains a_clean_latexfunction, but it only performs layout cleanup (removing labels and whitespace) and does not sanitize the text for malicious instructions targeting the LLM. - Data Exposure (LOW): The skill manages an
LLM_API_KEYvia a.envfile andconfig.py. While no secrets are hardcoded in the provided files, the skill's architecture requires the user to store sensitive credentials in a local environment file.
Audit Metadata