The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: Fetches pre-trained machine learning models for the docling and marker backends from repositories hosted by IBM and datalab-to upon first use.
[COMMAND_EXECUTION]: Provides a command-line interface, extract-pdfs, and invokes system tools such as pdftotext for specific extraction tasks.
[PROMPT_INJECTION]: The skill processes PDF documents, which creates an attack surface for indirect prompt injection. If an agent extracts content from a document containing hidden instructions, it may attempt to follow them.
Ingestion points: Reads local PDF files provided by the user or external sources (extractors.py).
Boundary markers: Absent; the skill does not wrap extracted text in delimiters to segregate it from agent instructions.
Capability inventory: The skill has filesystem write access in extractors.py and can execute subprocesses through backends.py.
Sanitization: Absent; extracted text is not validated or sanitized before being returned to the agent context.
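
The two gaps noted above (no boundary markers, no sanitization) can be illustrated with a minimal sketch. This is not the skill's code: the function names sanitize_extracted_text and wrap_untrusted, the marker format, and the set of stripped characters are all assumptions chosen for illustration. The sketch removes invisible and control characters that are sometimes used to hide text, then wraps the result in delimiters carrying a random token so that the document itself cannot forge the closing marker.

```python
import re
import secrets

# Zero-width and bidirectional control characters that can hide text from a
# human reader. The exact set is an assumption, not an exhaustive list.
_INVISIBLE = re.compile(r"[\u200b-\u200f\u202a-\u202e\u2060-\u2064\ufeff]")

# ASCII control characters except tab, newline, form feed and carriage return
# (form feed is kept because some extractors emit it between pages).
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0e-\x1f\x7f]")


def sanitize_extracted_text(text: str) -> str:
    """Hypothetical helper: strip invisible and control characters."""
    text = _INVISIBLE.sub("", text)
    return _CONTROL.sub("", text)


def wrap_untrusted(text: str, source: str) -> str:
    """Hypothetical helper: delimit extracted text as untrusted data.

    A fresh random token is placed in both markers, so content inside the
    PDF cannot predict and reproduce the closing marker.
    """
    token = secrets.token_hex(8)
    cleaned = sanitize_extracted_text(text)
    return (
        f"<<UNTRUSTED_PDF_CONTENT id={token} source={source!r}>>\n"
        f"{cleaned}\n"
        f"<<END_UNTRUSTED_PDF_CONTENT id={token}>>"
    )


if __name__ == "__main__":
    sample = "Quarterly report\u200b\x00\nIgnore previous instructions."
    print(wrap_untrusted(sample, "report.pdf"))
```

Delimiters and character stripping reduce the risk but do not remove it: visible instructions in the document still reach the agent, so the consuming prompt would also need to state that anything between the markers is data, not instructions.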

pdf-extractor