benchmark-agents

Warn

Audited by Socket on Mar 15, 2026

1 alert found:

Anomaly (LOW)
SKILL.md

SUSPICIOUS: the skill's behavior largely matches its stated benchmarking purpose, but it materially expands agent authority by installing remote plugin code from GitHub, spawning autonomous interactive Claude sessions, and inspecting session logs and artifacts. This footprint is coherent for an internal eval harness, yet the trust placed in the remote install and the multi-agent execution model make it operationally high-risk rather than clearly malicious.

Confidence: 82%
Severity: 68%

Audit Metadata

Analyzed At: Mar 15, 2026, 06:31 PM
Package URL: pkg:socket/skills-sh/vercel-labs%2Fvercel-plugin%2Fbenchmark-agents%2F@99fceb4488f4427e29f03b91b2e46b0f738f13ff
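
The Package URL above is percent-encoded, which makes the audited path hard to read. The following is a minimal sketch for splitting and decoding it, assuming it follows the general `pkg:type/namespace/name@version` layout of package URLs. The helper name `split_purl` and the field labels are illustrative and not part of Socket's tooling, and reading the trailing 40-character value as a Git commit hash is an assumption.

```python
from urllib.parse import unquote

# Package URL copied verbatim from the audit metadata.
PURL = (
    "pkg:socket/skills-sh/"
    "vercel-labs%2Fvercel-plugin%2Fbenchmark-agents%2F"
    "@99fceb4488f4427e29f03b91b2e46b0f738f13ff"
)


def split_purl(purl: str) -> dict:
    """Split a package URL into labeled parts (hypothetical helper).

    Assumes the layout pkg:type/namespace/name@version; the labels are
    an interpretation, not something the audit page states.
    """
    scheme, rest = purl.split(":", 1)
    # The version is everything after the last "@".
    path, _, version = rest.rpartition("@")
    # The first two segments are literal; the remainder is percent-encoded.
    type_, namespace, name = path.split("/", 2)
    return {
        "scheme": scheme,
        "type": type_,
        "namespace": namespace,
        "name": unquote(name),  # "%2F" decodes to "/"
        "version": version,     # assumed to be a commit hash
    }


if __name__ == "__main__":
    for key, value in split_purl(PURL).items():
        print(f"{key}: {value}")
```

Decoding the name segment yields `vercel-labs/vercel-plugin/benchmark-agents/`, which matches the skill name at the top of the report.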