NYC

advanced-evaluation

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE] (SAFE): No malicious code or instructions detected. The skill focuses on established AI evaluation methodologies.
  • [Indirect Prompt Injection] (SAFE): While the skill describes systems that ingest untrusted data (LLM outputs to be evaluated), it emphasizes strict rubrics, boundary markers, and bias mitigation protocols that align with security best practices for evaluation pipelines.
  • [Dependencies] (SAFE): Code snippets reference standard, well-known data science libraries including numpy, scipy, and scikit-learn for metric calculation. No suspicious external packages or remote downloads are present.
Audit Metadata
Risk Level: SAFE
Analyzed: Feb 17, 2026, 05:28 PM