skill-benchmark
Audited by Socket on Mar 11, 2026
1 alert found:
Obfuscated File
The Skill Benchmark skill is conceptually benign and internally consistent with its stated purpose of evaluating other Claude Code skills. It relies on standard tooling and explicit sandboxing to separate with-skill and baseline runs, and it foregrounds deterministic checks plus LLM-based grading. No credential reading, no unverifiable binaries, and no external data exfiltration patterns are described. The primary security caveat is the potential risk from executing external commands in a sandboxed environment; this should be mitigated by strict isolation, proper permission handling, and auditable logs. Overall, classify as BENIGN, with low to moderate risk tied to operational tooling and configuration management.