doc-to-vector-dataset-generator

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [Indirect Prompt Injection] (LOW): The skill ingests untrusted external documents which serve as an attack surface for malicious instructions that could influence downstream LLM systems.
  • Ingestion points: Files are read via extract_pdf and extract_markdown in SKILL.md.
  • Boundary markers: The pipeline lacks delimiters or explicit instructions to ignore embedded commands within the processed text chunks.
  • Capability inventory: The skill performs local file reading (open, glob, pymupdf.open) and writing (export_jsonl).
  • Sanitization: While clean_text performs regex-based noise reduction, it does not sanitize or escape content against adversarial prompt injection patterns.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 06:41 PM