multimodal-llm
Pass
Audited by Gen Agent Trust Hub on Apr 17, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill facilitates the ingestion of untrusted external data such as images, PDFs, and audio files, which is a common vector for indirect prompt injection.
- Ingestion points: Technical guides in
rules/vision-image-analysis.md,rules/vision-document.md, andrules/audio-speech-to-text.mddemonstrate reading and processing user-provided media. - Boundary markers: The provided prompt templates do not include delimiters or specific instructions (e.g., 'ignore embedded commands') to prevent the model from obeying instructions hidden within the media content.
- Capability inventory: The integration patterns utilize high-privilege SDKs (Anthropic, OpenAI, Google) and network tools, increasing the impact of a successful injection.
- Sanitization: No sanitization, validation, or escaping logic is recommended for the content extracted from processed media before it is used in downstream prompts.
Audit Metadata