local-benchmarks
Pass
Audited by Gen Agent Trust Hub on May 5, 2026
Risk Level: SAFE
EXTERNAL_DOWNLOADS, COMMAND_EXECUTION, DATA_EXFILTRATION
Full Analysis
- [EXTERNAL_DOWNLOADS]: The skill makes extensive use of `npx whatcanirun`, which downloads and executes the benchmarking tool directly from the NPM registry.
- [COMMAND_EXECUTION]: Shell commands including `ls`, `find`, `unzip`, and `python3` are used to discover local models, inspect metadata, and parse benchmark result files.
- [DATA_EXFILTRATION]: The skill provides an explicit workflow for users to 'submit' their benchmark results to the external service `whatcani.run`. This includes hardware specifications, model metadata, and performance metrics.
- [REMOTE_CODE_EXECUTION]: While `npx` technically executes remote code, in this context it is used to run the vendor's primary tool for the skill's stated purpose.
- [INDIRECT_PROMPT_INJECTION]: The skill ingests untrusted data in the form of local file paths and JSON metadata from ZIP bundles.
  - Ingestion points: `SKILL.md` (via `find` and `ls` on model directories and reading `manifest.json` from ZIP bundles).
  - Boundary markers: None identified in the provided instructions.
  - Capability inventory: Shell command execution (`npx`, `unzip`, `find`, `python3`).
  - Sanitization: Not explicitly implemented in the provided shell scripts.
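The missing boundary markers and sanitization noted above could be addressed along the following lines. This is a minimal sketch, not the skill's actual code: the function names `read_manifest` and `wrap_untrusted`, the 64 KiB size cap, and the `<<<UNTRUSTED ... >>>` marker format are all assumptions chosen for illustration. It reads `manifest.json` from a ZIP bundle without extracting it to disk, rejects oversized or non-object payloads, and fences the serialized result so that downstream instructions can treat it as data only.

```python
import json
import zipfile

# Assumed size cap for a manifest; not taken from the audited skill.
MAX_MANIFEST_BYTES = 64 * 1024


def read_manifest(bundle_path, member="manifest.json"):
    """Read and parse a manifest from a ZIP bundle without extracting it to disk."""
    with zipfile.ZipFile(bundle_path) as bundle:
        # bundle.open raises KeyError if the member is missing.
        with bundle.open(member) as fh:
            raw = fh.read(MAX_MANIFEST_BYTES + 1)
    if len(raw) > MAX_MANIFEST_BYTES:
        raise ValueError(f"{member} exceeds {MAX_MANIFEST_BYTES} bytes")
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError(f"{member} must contain a JSON object")
    return data


def wrap_untrusted(label, value):
    """Serialize a value as JSON and fence it with explicit boundary markers.

    Every "<" in the body is rewritten as the JSON escape \\u003c, so the
    untrusted content cannot reproduce the closing marker.
    """
    body = json.dumps(value, ensure_ascii=True, indent=2).replace("<", "\\u003c")
    return (
        f"<<<UNTRUSTED {label} BEGIN>>>\n"
        f"{body}\n"
        f"<<<UNTRUSTED {label} END>>>"
    )
```

A caller would pass the bundle path to `read_manifest` and hand only the output of `wrap_untrusted("manifest", data)` to the model. Markers of this kind reduce, but do not eliminate, the risk that embedded text is followed as an instruction.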
Audit Metadata