The Agent Skills Directory

PROMPT_INJECTION (HIGH): The skill implements a Question Answering system that is susceptible to Indirect Prompt Injection (Category 8). \n
Ingestion points: In examples/paper_content_processor.py, the download_and_process_paper method retrieves PDFs from external URLs and extracts all text content using PyPDF2. \n
Boundary markers: In examples/paper_question_answerer.py, the generate_answer method interpolates this raw text into the LLM prompt using only a simple 'Context from research papers:' header as a delimiter. This lacks robust isolation or instructions to ignore embedded commands. \n
Capability inventory: The agent uses the extracted content to synthesize findings and answer user questions, which allows an attacker to influence the agent's persona, reasoning, and output by uploading a malicious paper to ArXiv. \n
Sanitization: No filtering, escaping, or validation is performed on the extracted PDF text before prompt interpolation. \n- EXTERNAL_DOWNLOADS (MEDIUM): The skill performs automated network downloads of external PDF files from non-whitelisted domains. \n
Evidence: In examples/paper_content_processor.py, requests.get(pdf_url) is used to fetch content from URLs provided by the ArXiv API. ArXiv is not a whitelisted source in the security policy, and the lack of domain validation for the pdf_url could allow connections to malicious sites if the metadata is manipulated.

chat-with-arxiv