The Agent Skills Directory

[COMMAND_EXECUTION]: The script scripts/pdf_ocr_processor.py uses subprocess.check_call to execute pip install commands. This allows the skill to modify the runtime environment by executing system-level commands to manage Python packages.
[EXTERNAL_DOWNLOADS]: The skill makes network requests to the SiliconFlow API (https://api.siliconflow.cn/v1/chat/completions) using the requests library to perform OCR via cloud-based models.
[REMOTE_CODE_EXECUTION]: Through its automatic dependency installation mechanism, the skill downloads and executes code from the Python Package Index (PyPI) at runtime, which is an unverifiable method of managing dependencies.
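One way to avoid the runtime installation described in the two dependency findings above is to check for required packages and stop with a clear error, leaving installation to the user. The sketch below is illustrative only and is not taken from the skill; the function names missing_packages and ensure_dependencies are hypothetical, and it assumes the caller passes top-level import names.

```python
import importlib.util


def missing_packages(required):
    """Return the top-level module names in `required` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in required if importlib.util.find_spec(name) is None]


def ensure_dependencies(required):
    """Raise instead of installing, so nothing is downloaded at runtime."""
    missing = missing_packages(required)
    if missing:
        raise RuntimeError(
            "Missing packages: " + ", ".join(missing)
            + ". Install them from a reviewed, pinned requirements file."
        )
```

With this approach the script never calls pip itself, so the set of installed packages stays under the control of whoever provisions the environment.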
[DATA_EXFILTRATION]: User-provided document content is converted to base64 and transmitted to an external service (SiliconFlow) for OCR processing. While this is the intended functionality, it involves sending potentially sensitive local data to a third-party API.
[PROMPT_INJECTION]: The skill processes untrusted PDF and image files, making it susceptible to indirect prompt injection where malicious text in documents could influence the calling agent's behavior.
Ingestion points: The processor reads local PDF and image files provided to the PDFOCRProcessor.pdf_to_images and PDFOCRProcessor.ocr_image_file functions.
Boundary markers: The SiliconFlowOCREngine.recognize function uses a system prompt that instructs the model to "output pure text format" and "not add any extra explanation," which provides a basic but incomplete boundary.
Capability inventory: The skill has network access (requests) and the ability to execute system commands (subprocess).
Sanitization: There is no evidence of sanitization, escaping, or filtering of the extracted text content before it is returned to the agent context.
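The missing sanitization and boundary marking could be addressed along the following lines. This is a minimal sketch under stated assumptions, not the skill's code: the function name wrap_untrusted and the marker format are hypothetical, and wrapping text this way reduces the risk of indirect prompt injection rather than eliminating it.

```python
import secrets


def wrap_untrusted(text: str) -> str:
    """Strip non-printable characters and fence OCR output in labeled markers."""
    # Keep newlines and tabs; drop control and zero-width characters.
    cleaned = "".join(ch for ch in text if ch in "\n\t" or ch.isprintable())
    # A random token per call makes the closing marker hard for a document to forge.
    token = secrets.token_hex(8)
    return (
        f"<<UNTRUSTED_OCR_TEXT {token}>>\n"
        f"{cleaned}\n"
        f"<<END_UNTRUSTED_OCR_TEXT {token}>>"
    )
```

The calling agent would still need to be instructed to treat everything between the markers as data and never as instructions.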

pdf-ocr-skill