tbench
Pass
Audited by Gen Agent Trust Hub on Feb 28, 2026
Risk Level: SAFE, EXTERNAL_DOWNLOADS, COMMAND_EXECUTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: The documentation includes instructions for installing the `huggingface_hub` Python package and cloning a dataset repository from HuggingFace. These references target a well-known service for the purpose of submitting and retrieving benchmark results.
- [COMMAND_EXECUTION]: The skill provides numerous shell command examples for executing benchmarks via `make`, managing GitHub Action workflows via `gh`, and performing data analysis with `bq`. These commands are standard for a technical benchmarking and CI/CD environment.
- [SAFE]: No security threats were detected. Analysis of the provided markdown content found no evidence of prompt injection, data exfiltration, or obfuscated malicious code. Credentials such as API keys are represented by placeholders, and the primary functionality aligns with the stated purpose of agent evaluation.
Audit Metadata