agentbench
Warn
Audited by Socket on Mar 15, 2026
1 alert found:
Anomaly (LOW)
SKILL.md
SUSPICIOUS: the stated purpose matches a benchmark runner, and there is no clear credential harvesting or covert exfiltration. Risk is medium because the skill directs the agent to execute repository-provided tasks and shell scripts, potentially process external content, and take broad actions across many workflows without a strong signed-release trust model.
Confidence: 81%
Severity: 58%
Audit Metadata