llamaguard

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFE
Full Analysis
  • [EXTERNAL_DOWNLOADS] (SAFE): The skill downloads model weights and tokenizers from HuggingFace (meta-llama/LlamaGuard-7b). HuggingFace is a trusted platform, and the Meta-Llama organization is a verified trusted source.
  • [CREDENTIALS_UNSAFE] (SAFE): No hardcoded API keys or secrets were found. The skill correctly follows security best practices by instructing the user to use the huggingface-cli login command for authentication.
  • [COMMAND_EXECUTION] (SAFE): Shell commands are restricted to standard package management (pip install) and official tool authentication. No suspicious piped execution or arbitrary command strings were detected.
  • [INDIRECT_PROMPT_INJECTION] (SAFE): Although the skill is designed to process untrusted user content for moderation, it uses apply_chat_template for safe formatting and lacks any high-privilege capabilities (such as shell access or file writing) that could be exploited by malicious input data.
  • [DATA_EXFILTRATION] (SAFE): No unauthorized network operations or data exfiltration patterns were observed. Network activity is limited to downloading model assets and standard API serving via FastAPI.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 04:58 PM