llm-serving-auto-benchmark
Audited by Socket on May 3, 2026
4 alerts found (Anomaly ×4):
This YAML fragment is not inherently malicious; it is a multi-framework LLM serving/benchmark configuration. The primary security issue is enabling trust_remote_code: true for SGLang, vLLM, and TensorRT-LLM while loading a model/tokenizer from an external repository reference. That combination can allow arbitrary code execution during model/artifact initialization if the artifacts are not pinned and verified. Other elements (the benchmark endpoint, dataset parameters, and output_dir writes) appear consistent with standard benchmarking and do not show explicit exfiltration or credential theft within this snippet.
No overt malware is present in this YAML snippet. However, it materially increases supply-chain execution risk by enabling trust_remote_code: true for multiple serving frameworks while referencing a third-party model/tokenizer (MiniMaxAI/MiniMax-M2.5). If any backend honors this flag by executing repository-provided code during model loading, compromise of the model artifacts could lead to arbitrary code execution. The remainder of the config is standard for serving/benchmarking and shows no explicit data-theft or exfiltration behavior.
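For context, a minimal hypothetical sketch of the pattern these alerts describe follows. The section and key names (model, backends, and the per-framework blocks) are illustrative assumptions, not the audited file's actual layout; only trust_remote_code and the MiniMaxAI/MiniMax-M2.5 reference come from the alerts themselves.

```yaml
# Hypothetical reconstruction of the flagged pattern -- not the audited file.
model: MiniMaxAI/MiniMax-M2.5   # third-party repo reference with no pinned revision
backends:
  sglang:
    trust_remote_code: true     # lets repository-provided code run at load time
  vllm:
    trust_remote_code: true
  tensorrt_llm:
    trust_remote_code: true
```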
This configuration does not show direct malware behaviors within the YAML itself, but it materially increases supply-chain risk by enabling trust_remote_code: true for multiple LLM serving frameworks. That setting can allow execution of model-repository-provided code during model resolution/loading. The remaining behaviors (server startup, /v1/completions benchmarking, and writing results to output_dir) appear operationally typical, with secondary risks around benchmark artifact reuse (search.resume). Treat the referenced model artifacts as high-risk unless pinned to an immutable revision and verified/sandboxed according to your environment’s security controls.
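The operationally typical portion this alert refers to would look something like the sketch below; the key names (benchmark, endpoint, output_dir, search.resume) follow the alert's wording, but the exact structure is an assumption.

```yaml
# Illustrative sketch of the benchmark/search portion (structure assumed).
benchmark:
  endpoint: /v1/completions   # standard OpenAI-compatible completions route
  output_dir: ./results       # results written locally; no exfiltration implied
search:
  resume: true                # reuses prior benchmark artifacts -- the secondary risk noted above
```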
No overt malicious payload is visible in this YAML fragment; it is an orchestration/benchmark configuration. The major concern is that trust_remote_code: true is enabled for multiple inference frameworks while loading a remote model/tokenizer from an external repository, which can allow repository-supplied code to execute during model initialization. Treat this as a significant supply-chain execution risk and mitigate it by pinning exact model revisions/digests, verifying artifact provenance, and disabling trust_remote_code unless it is strictly necessary.
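As a hedged illustration of that mitigation, assuming each backend's loader accepts a Hugging Face-style revision field (exact key names vary by framework):

```yaml
# Hardened variant (illustrative): pin an immutable revision, drop trust_remote_code.
model: MiniMaxAI/MiniMax-M2.5
revision: "<full commit SHA of an audited snapshot>"  # immutable pin, not a mutable tag
backends:
  sglang:
    trust_remote_code: false   # enable only if the model genuinely requires custom code
  vllm:
    trust_remote_code: false
  tensorrt_llm:
    trust_remote_code: false
```

Pinning to a commit SHA rather than a branch or tag matters because tags can be moved after an audit; a SHA keeps the reviewed artifacts immutable.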