llm-evaluation
Fail
Audited by Socket on Mar 8, 2026
1 alert found:
Obfuscated File (HIGH): SKILL.md
The skill presents a coherent, well-scoped framework for evaluating LLM applications via automated metrics, human judgments, and LLM-based assessments. It does not require dangerous downloads, credential access, or remote actions, and its data flows are contained to the evaluated inputs and the metrics and annotations it produces. Overall risk is low with respect to security and data privacy; the skill's footprint is proportionate to its stated purpose as an evaluation framework.
Confidence: 98%