prompt-guard
Pass
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: LOWEXTERNAL_DOWNLOADS
Full Analysis
- [EXTERNAL_DOWNLOADS] (LOW): The skill uses the transformers library to download model weights from the meta-llama/Prompt-Guard-86M repository on Hugging Face. Under [TRUST-SCOPE-RULE], since meta-llama is a trusted organization, the severity of the external download is downgraded to LOW.
- [INDIRECT_PROMPT_INJECTION] (LOW): The skill functions in SKILL.md (get_jailbreak_score, filter_third_party_data, batch_filter_documents) ingest untrusted data from users or RAG sources. 1. Ingestion points: Input parameters of classification functions. 2. Boundary markers: Absent. 3. Capability inventory: No dangerous side effects (no file writing, no network exfiltration, no subprocess execution). 4. Sanitization: None, as the skill itself is a classification-based sanitizer. Given the limited capabilities, this is assessed at the LOW severity tier.
Audit Metadata