The Agent Skills Directory

[Indirect Prompt Injection] (LOW): The skill ingests untrusted external documents which serve as an attack surface for malicious instructions that could influence downstream LLM systems.
Ingestion points: Files are read via extract_pdf and extract_markdown in SKILL.md.
Boundary markers: The pipeline lacks delimiters or explicit instructions to ignore embedded commands within the processed text chunks.
Capability inventory: The skill performs local file reading (open, glob, pymupdf.open) and writing (export_jsonl).
Sanitization: While clean_text performs regex-based noise reduction, it does not sanitize or escape content against adversarial prompt injection patterns.

doc-to-vector-dataset-generator