protocol-entity-extraction
Pass
Audited by Gen Agent Trust Hub on Mar 6, 2026
Risk Level: SAFE
Full Analysis
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection (Category 8) as it processes untrusted clinical trial documents (PDF/DOCX) and interpolates the extracted text directly into prompts (e.g., in
prompts/endpoints.mdandprompts/study_design.md). - Ingestion points: Document content is read in
scripts/parse_protocol.pyusingget_document_parser(). - Boundary markers: Uses triple-dash delimiters (
---) to separate instructions from document text, which provides basic but not foolproof structural separation. - Capability inventory: The skill has file-write capabilities (
scripts/parse_protocol.py) and utilizes an external API for data processing. - Sanitization: No explicit sanitization or filtering of the document text is performed before prompt interpolation.
- [DATA_EXPOSURE]: The skill requires a
UNSTRUCTURED_API_KEYfor cloud-based document parsing. This involves sending document content to the Unstructured.io service. This is documented as a core feature and represents normal functional behavior for high-fidelity document extraction. - [COMMAND_EXECUTION]: The Python scripts (
scripts/parse_protocol.pyandscripts/merge_entities.py) perform standard file I/O and process orchestration without any arbitrary command execution or shell injection vectors.
Audit Metadata