The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: The documentation includes instructions for installing the huggingface_hub Python package and cloning a dataset repository from Hugging Face. Both references target a well-known service and are used to submit and retrieve benchmark results.
[COMMAND_EXECUTION]: The skill provides numerous shell command examples for executing benchmarks via make, managing GitHub Actions workflows via gh, and performing data analysis with bq. These commands are standard for a technical benchmarking and CI/CD environment.
[SAFE]: No security threats were detected. Analysis of the provided markdown content found no evidence of prompt injection, data exfiltration, or obfuscated malicious code. Credentials such as API keys are represented by placeholders, and the primary functionality aligns with the stated purpose of agent evaluation.

tbench