The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: The script scripts/pdf_to_images.py downloads PDF files from user-provided URLs using urllib.request.urlopen (line 49). \n- [COMMAND_EXECUTION]: The script scripts/pdf_to_images.py executes the pdftoppm system utility via subprocess.run (line 120) to convert PDF pages into images. Arguments are constructed using a sanitization function (sanitize_name) to reduce the risk of path traversal or command injection from URL-derived filenames. \n- [PROMPT_INJECTION]: The skill presents an attack surface for indirect prompt injection (Category 8) as it processes untrusted visual data from images and PDFs without explicit security delimiters. \n
Ingestion points: scripts/pdf_to_images.py (downloads) and scripts/qianfan_ocr_cli.py (image data extraction). \n
Boundary markers: Absent; the prompts provided in the references/ directory do not utilize explicit boundary markers or instructions to disregard embedded commands in document text. \n
Capability inventory: The skill can write files to the local filesystem and execute system utilities (pdftoppm) via subprocess calls. \n
Sanitization: Filename sanitization is performed, but document content is passed to the VLM without filtering. \n- [COMMAND_EXECUTION]: Multiple runner scripts (e.g., scripts/run_document_parsing.py) utilize importlib.util to dynamically load local utility modules from the skill's own directory. Although the paths are restricted to the local scripts/ folder, this involves dynamic code loading patterns.

qianfanocr-document-intelligence