huggingface-community-evals
Pass
Audited by Gen Agent Trust Hub on Mar 23, 2026
Risk Level: SAFE
Flags: COMMAND_EXECUTION, EXTERNAL_DOWNLOADS
Full Analysis
- [Shell Command Execution]: The skill uses subprocess.run() to execute evaluation CLI tools like inspect and lighteval. This is a standard pattern for Python scripts acting as wrappers for CLI frameworks.
  - Evidence found in scripts/inspect_eval_uv.py, scripts/inspect_vllm_uv.py, and scripts/lighteval_vllm_uv.py.
- [Remote Code Execution via Model Loading]: The scripts include a --trust-remote-code flag. When enabled by the user, this allows the underlying libraries (transformers, vllm) to execute custom code provided by model authors on the Hugging Face Hub. This is a common but sensitive feature in the machine learning ecosystem required for certain model architectures.
  - Documentation and implementation found in SKILL.md and all script files.
- [Credential Management]: The skill correctly advises users to manage their Hugging Face tokens via environment variables (HF_TOKEN) and provides a template in examples/.env.example. This follows security best practices for secret management by avoiding hardcoded keys.
  - Evidence in examples/.env.example and environment setup functions in scripts.
Audit Metadata