The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: The skill fetches model weights and source code from official repositories on GitHub and HuggingFace belonging to DeepSeek-AI. It also downloads official pre-built binaries for the vLLM project from GitHub Releases.
[REMOTE_CODE_EXECUTION]: The model loading process utilizes the trust_remote_code=True parameter in the Transformers library. This is a standard and necessary requirement for executing the custom architecture code provided by the model authors in the HuggingFace repository.
[COMMAND_EXECUTION]: Installation instructions include commands for setting up Python environments and installing specialized performance libraries like flash-attn and CUDA-optimized versions of PyTorch.
[INDIRECT_PROMPT_INJECTION]: As the skill is designed to extract text from external images and PDF files, it possesses a surface for indirect prompt injection. Malicious text embedded within processed documents could potentially influence subsequent agent logic.
Ingestion points: Path parameters in functions such as batch_ocr(image_dir, ...) and pdf_to_markdown(pdf_path).
Boundary markers: None identified; the extracted text is processed directly as strings without explicit delimiters in the provided examples.
Capability inventory: Includes file system access (read/write) and model inference capabilities.
Sanitization: No specific filtering or validation of the text output from the OCR process is demonstrated in the implementation snippets.

deepseek-ocr