doc-to-vector-dataset-generator
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFE | PROMPT_INJECTION
Full Analysis
- [Indirect Prompt Injection] (LOW): The skill ingests untrusted external documents which serve as an attack surface for malicious instructions that could influence downstream LLM systems.
  - Ingestion points: Files are read via `extract_pdf` and `extract_markdown` in `SKILL.md`.
  - Boundary markers: The pipeline lacks delimiters or explicit instructions to ignore embedded commands within the processed text chunks.
  - Capability inventory: The skill performs local file reading (`open`, `glob`, `pymupdf.open`) and writing (`export_jsonl`).
  - Sanitization: While `clean_text` performs regex-based noise reduction, it does not sanitize or escape content against adversarial prompt injection patterns.
Audit Metadata