The Agent Skills Directory

[COMMAND_EXECUTION]: The utility scripts run_accuracy_test.sh and run_performance_test.sh use sed to modify Python source files (.py) in place at runtime, injecting configuration parameters such as host IPs and ports.
Evidence: sed -i.bak operations in scripts/run_accuracy_test.sh targeting files returned by the ais_bench --search command.
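To make the risk concrete, the following is a minimal Python sketch of the same patch-in-place pattern, with input validation added before anything is written. It is not the skill's actual script. The function name patch_config, the keys host_ip and host_port, and the double-quoted assignment form are assumptions made for illustration.

```python
import ipaddress
import re
import shutil
from pathlib import Path


def patch_config(path: Path, host_ip: str, host_port: int) -> Path:
    """Rewrite host_ip / host_port assignments in place, keeping a .bak copy.

    Hypothetical helper: key names and quoting style are assumed, not taken
    from the audited scripts.
    """
    # Validate first, so untrusted input can never be written into a .py file.
    ipaddress.ip_address(host_ip)  # raises ValueError on anything but an IP
    if not 0 < host_port < 65536:
        raise ValueError(f"port out of range: {host_port}")

    # Same role as the .bak suffix in `sed -i.bak`: keep the original file.
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)

    text = path.read_text(encoding="utf-8")
    # Replace only the right-hand side of simple keyword assignments.
    text = re.sub(r'(host_ip\s*=\s*)"[^"]*"',
                  lambda m: f'{m.group(1)}"{host_ip}"', text)
    text = re.sub(r'(host_port\s*=\s*)\d+',
                  lambda m: f'{m.group(1)}{host_port}', text)
    path.write_text(text, encoding="utf-8")
    return backup
```

Without the validation step, a crafted host value containing a quote character would be written verbatim into a file that is later imported as Python, which is why this pattern is flagged.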
[EXTERNAL_DOWNLOADS]: The skill requires downloading the benchmark tool and various datasets from external sources including GitHub and Alibaba Cloud OSS.
Evidence: git clone https://github.com/AISBench/benchmark.git and dataset URLs from opencompass.oss-cn-shanghai.aliyuncs.com.
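One common mitigation for externally downloaded artifacts is to verify them against a pinned checksum before use. The sketch below assumes a SHA-256 digest has been obtained from a trusted channel; the source does not say whether the benchmark or dataset hosts publish such digests, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned digest.

    expected_sha256 is assumed to come from a trusted source, not from the
    same server that served the file.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(f"checksum mismatch for {path.name}: got {actual}")
```

For the git clone step, the analogous measure is to check out a pinned commit hash rather than the default branch.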
[REMOTE_CODE_EXECUTION]: The tool exposes a configuration option to trust remote code from model repositories, which allows arbitrary code bundled with the model weights to execute during loading.
Evidence: The trust_remote_code parameter is present and configurable in assets/model_config_template.py and references/model-configs.md.
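A conservative way to handle this option is to keep it off by default and enable it only for model repositories that have been reviewed. The sketch below illustrates that policy in plain Python. The function name, the allowlist, and the model identifiers are hypothetical and are not part of the skill or of ais_bench.

```python
def resolve_trust_remote_code(model_path: str,
                              requested: bool,
                              allowlist: frozenset) -> bool:
    """Decide the effective trust_remote_code value for a model.

    Returns False unless the caller asked for it explicitly, and refuses
    the request when the model is not on the reviewed allowlist.
    """
    if not requested:
        return False
    if model_path not in allowlist:
        raise PermissionError(
            f"trust_remote_code requested for unreviewed model: {model_path}"
        )
    return True


# Hypothetical identifiers, for illustration only.
REVIEWED_MODELS = frozenset({"example-org/reviewed-model"})
```

The resulting boolean would then be written to the trust_remote_code field of the model configuration instead of a hard-coded True.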

ais-bench