paddleocr-doc-parsing

Pass

Audited by Gen Agent Trust Hub on Mar 29, 2026

Risk Level: SAFE
Full Analysis
  • [DATA_EXFILTRATION]: The skill transmits document content to a user-configured API endpoint via HTTPS. This is the core functionality of the skill. Sensitive credentials like PADDLEOCR_ACCESS_TOKEN are managed via environment variables, adhering to security best practices.\n- [PROMPT_INJECTION]: The skill acts as a data ingestion point, parsing text from external PDFs and images. This introduces a surface for indirect prompt injection if the processed documents contain malicious instructions intended to manipulate the agent. \n
  • Ingestion points: scripts/vl_caller.py processes local files via --file-path or remote files via --file-url.\n
  • Boundary markers: No explicit delimiters or instructions are used to wrap the extracted text.\n
  • Capability inventory: The skill uses httpx for network requests and interacts with the filesystem to read inputs and save results.\n
  • Sanitization: Extracted content is returned without sanitization or filtering.\n- [COMMAND_EXECUTION]: The skill executes local Python scripts to interact with the PaddleOCR API. These scripts perform well-defined tasks like file reading, HTTP requests, and image optimization without using dangerous primitives like shell execution of user input.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 29, 2026, 02:30 PM