paddleocr-doc-parsing
Pass
Audited by Gen Agent Trust Hub on Mar 29, 2026
Risk Level: SAFE
Full Analysis
- [DATA_EXFILTRATION]: The skill transmits document content to a user-configured API endpoint via HTTPS. This is the core functionality of the skill. Sensitive credentials like PADDLEOCR_ACCESS_TOKEN are managed via environment variables, adhering to security best practices.\n- [PROMPT_INJECTION]: The skill acts as a data ingestion point, parsing text from external PDFs and images. This introduces a surface for indirect prompt injection if the processed documents contain malicious instructions intended to manipulate the agent. \n
- Ingestion points: scripts/vl_caller.py processes local files via --file-path or remote files via --file-url.\n
- Boundary markers: No explicit delimiters or instructions are used to wrap the extracted text.\n
- Capability inventory: The skill uses httpx for network requests and interacts with the filesystem to read inputs and save results.\n
- Sanitization: Extracted content is returned without sanitization or filtering.\n- [COMMAND_EXECUTION]: The skill executes local Python scripts to interact with the PaddleOCR API. These scripts perform well-defined tasks like file reading, HTTP requests, and image optimization without using dangerous primitives like shell execution of user input.
Audit Metadata