ais-bench
Warn
Audited by Gen Agent Trust Hub on Feb 25, 2026
Risk Level: MEDIUM
COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION
Full Analysis
- [COMMAND_EXECUTION]: The utility scripts `run_accuracy_test.sh` and `run_performance_test.sh` use the `sed` command to programmatically modify Python source files (`.py`) at runtime to inject configuration parameters such as host IPs and ports.
  - Evidence: `sed -i.bak` operations in `scripts/run_accuracy_test.sh` targeting files returned by the `ais_bench --search` command.
- [EXTERNAL_DOWNLOADS]: The skill requires downloading the benchmark tool and various datasets from external sources, including GitHub and Alibaba Cloud OSS.
  - Evidence: `git clone https://github.com/AISBench/benchmark.git` and dataset URLs from `opencompass.oss-cn-shanghai.aliyuncs.com`.
- [REMOTE_CODE_EXECUTION]: The tool exposes a configuration option to trust remote code from model repositories, which allows the execution of arbitrary code bundled with model weights during the loading process.
  - Evidence: The `trust_remote_code` parameter is present and configurable in `assets/model_config_template.py` and `references/model-configs.md`.
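For the REMOTE_CODE_EXECUTION finding, a reviewer may want to confirm how `trust_remote_code` is actually set before running a benchmark. The following is a minimal sketch of such a check. The audit does not show the contents of `assets/model_config_template.py`, so the `SAMPLE` config layout, the function name `find_trust_remote_code`, and the helper `_literal` are illustrative assumptions, not part of the audited skill.

```python
import ast


def _literal(node):
    # Return the literal value if the setting is a constant, else a marker.
    return node.value if isinstance(node, ast.Constant) else "<non-literal>"


def find_trust_remote_code(source: str):
    """Return (line, value) pairs for each trust_remote_code setting in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Keyword form, e.g. dict(..., trust_remote_code=True)
        if isinstance(node, ast.keyword) and node.arg == "trust_remote_code":
            hits.append((node.value.lineno, _literal(node.value)))
        # Dict-literal form, e.g. {"trust_remote_code": True}
        elif isinstance(node, ast.Dict):
            for key, value in zip(node.keys, node.values):
                if isinstance(key, ast.Constant) and key.value == "trust_remote_code":
                    hits.append((value.lineno, _literal(value)))
    # Sort by line number only, since values may be of mixed types.
    return sorted(hits, key=lambda hit: hit[0])


# Hypothetical config text; the real template's structure may differ.
SAMPLE = '''
models = [
    dict(
        abbr="demo-model",
        path="/path/to/model",
        trust_remote_code=True,
    ),
    {"abbr": "other", "trust_remote_code": False},
]
'''

if __name__ == "__main__":
    for line, value in find_trust_remote_code(SAMPLE):
        status = "ok" if value is False else "REVIEW"
        print(f"line {line}: trust_remote_code={value!r} [{status}]")
```

The sketch parses the file with `ast` rather than importing it, so inspecting a config never executes it. Any value other than a literal `False` is marked for review, on the assumption that a non-literal setting could resolve to `True` at runtime.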
Audit Metadata