skills/wshobson/agents/llm-evaluation/Gen Agent Trust Hub

llm-evaluation

Pass

Audited by Gen Agent Trust Hub on Mar 7, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: No malicious patterns, obfuscation, or unauthorized data access attempts were detected in the skill content.
  • [EXTERNAL_DOWNLOADS]: The skill includes code snippets that use the transformers and detoxify libraries to download pre-trained models from trusted sources such as Hugging Face, Microsoft, and Facebook. These are standard operations for the described machine learning evaluation tasks.
  • [PROMPT_INJECTION]: The skill implements 'LLM-as-Judge' patterns where untrusted data (model outputs) are interpolated into evaluation prompts (e.g., in llm_judge_quality). While this introduces a surface for indirect prompt injection, it is a standard design pattern for LLM evaluation. The skill includes notes on best practices like human validation to mitigate these risks.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 7, 2026, 05:10 PM