The Agent Skills Directory

[PROMPT_INJECTION]: Indirect prompt injection vulnerability surface detected in the OCR processing workflow.
Ingestion points: The skill ingests untrusted data from image files via cv2.imread and pytesseract.image_to_string within the ChuukeseOCRProcessor and ChuukeseOCRPostProcessor classes.
Boundary markers: There are no delimiters or 'ignore' instructions implemented to prevent the agent from potentially following malicious instructions embedded within the scanned images.
Capability inventory: The skill facilitates file system reads for image processing and text extraction.
Sanitization: The extracted text is processed for character correction but is not sanitized or escaped to prevent command-like behavior if the output is passed to an LLM.
[COMMAND_EXECUTION]: The skill depends on the execution of an external system binary.
The pytesseract library functions as a wrapper that invokes the tesseract OCR engine on the host system. This involves spawning a subprocess to execute the binary, which is a standard but noteworthy behavior requiring appropriate environment permissions.

document-ocr-processing