pdf-extract-experimental-materials

Pass

Audited by Gen Agent Trust Hub on Apr 9, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill presents a surface for indirect prompt injection because it processes untrusted content from PDFs and Markdown files without implementing boundary markers. Maliciously crafted documents containing hidden instructions could potentially influence the agent's behavior during the extraction process. Ingestion points: The skill accepts PDF-derived Markdown and text (e.g., input.pdf, out.md) as primary inputs. Boundary markers: The instructions lack delimiters (like XML tags or triple quotes) to isolate data from agent instructions. Capability inventory: Across the skill, the agent is granted capabilities to write multiple CSV files (main_reagents.csv, main_instruments.csv, reagent_preparation.csv) to the local file system. Sanitization: There is no evidence of sanitization or filtering of the extracted text before it is processed by the model.\n- [COMMAND_EXECUTION]: The skill instructions reference an absolute local path for a script (d:\SKILL\project\pdf-extract\scripts\extract_pdf.py). Using hardcoded absolute paths to execute scripts is a poor security practice that can lead to failure or unexpected behavior if the target environment differs from the developer's system or if an attacker can manipulate files at that specific location.\n- [SAFE]: The declared Python dependencies, pdfplumber and pytesseract, are standard and well-known libraries for PDF parsing and OCR tasks.\n- [SAFE]: The provided validation script (scripts/validate_skill.py) and the audit record JSON file do not contain malicious code, hidden instructions, or obfuscated patterns.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 9, 2026, 01:09 AM