mapping-documents
Pass
Audited by Gen Agent Trust Hub on May 7, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill uses well-established libraries like
pdfplumberand the officialanthropicSDK, which are standard for document processing and AI integration tasks.\n- [EXTERNAL_DOWNLOADS]: The skill connects to the Anthropic API for its primary functionality of semantic extraction. Anthropic is a well-known service provider, and this communication is necessary and intended for the skill's stated purpose.\n- [DATA_EXFILTRATION]: The skill handles API keys according to security best practices, allowing keys to be passed via environment variables or CLI flags rather than being hardcoded in the scripts.\n- [INDIRECT_PROMPT_INJECTION]: As the skill ingests and processes content from untrusted PDF files, it possesses an indirect prompt injection surface. A malicious PDF could attempt to influence the agent by embedding instructions in the extracted text. The skill implements several mitigation strategies:\n - Ingestion points:
scripts/docmap.pyreads PDF content usingpdfplumber.open.\n - Boundary markers: The semantic extraction prompts use structured labels (
SECTION,PAGES,TEXT) and enforce JSON response schemas to maintain separation between data and instructions.\n - Capability inventory: The skill is restricted to file operations (writing maps and indexes) and network communication with the Anthropic API; it does not execute code extracted from the documents.\n
- Sanitization: The
_normalize_symbolfunction uses Unicode NFKC normalization to ensure that extracted symbols are consistent and to prevent homoglyph substitution attacks.
Audit Metadata