document-ocr-processing

Pass

Audited by Gen Agent Trust Hub on Mar 1, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: Indirect prompt injection vulnerability surface detected in the OCR processing workflow.
  • Ingestion points: The skill ingests untrusted data from image files via cv2.imread and pytesseract.image_to_string within the ChuukeseOCRProcessor and ChuukeseOCRPostProcessor classes.
  • Boundary markers: There are no delimiters or 'ignore' instructions implemented to prevent the agent from potentially following malicious instructions embedded within the scanned images.
  • Capability inventory: The skill facilitates file system reads for image processing and text extraction.
  • Sanitization: The extracted text is processed for character correction but is not sanitized or escaped to prevent command-like behavior if the output is passed to an LLM.
  • [COMMAND_EXECUTION]: The skill depends on the execution of an external system binary.
  • The pytesseract library functions as a wrapper that invokes the tesseract OCR engine on the host system. This involves spawning a subprocess to execute the binary, which is a standard but noteworthy behavior requiring appropriate environment permissions.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 1, 2026, 01:10 AM