protocol-entity-extraction

Pass

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: SAFE
Full Analysis
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection (Category 8) as it processes untrusted clinical trial documents (PDF/DOCX) and interpolates the extracted text directly into prompts (e.g., in prompts/endpoints.md and prompts/study_design.md).
  • Ingestion points: Document content is read in scripts/parse_protocol.py using get_document_parser().
  • Boundary markers: Uses triple-dash delimiters (---) to separate instructions from document text, which provides basic but not foolproof structural separation.
  • Capability inventory: The skill has file-write capabilities (scripts/parse_protocol.py) and utilizes an external API for data processing.
  • Sanitization: No explicit sanitization or filtering of the document text is performed before prompt interpolation.
  • [DATA_EXPOSURE]: The skill requires a UNSTRUCTURED_API_KEY for cloud-based document parsing. This involves sending document content to the Unstructured.io service. This is documented as a core feature and represents normal functional behavior for high-fidelity document extraction.
  • [COMMAND_EXECUTION]: The Python scripts (scripts/parse_protocol.py and scripts/merge_entities.py) perform standard file I/O and process orchestration without any arbitrary command execution or shell injection vectors.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 6, 2026, 04:19 PM