llm-serving-auto-benchmark
Audited by Socket on May 3, 2026
4 alerts found (Anomaly ×4):
This YAML fragment is not inherently malicious; it is a multi-framework LLM serving/benchmark configuration. The primary security issue is enabling trust_remote_code: true for SGLang, vLLM, and TensorRT-LLM while loading a model/tokenizer from an external repository reference. That combination can allow arbitrary code execution during model/artifact initialization if the artifacts are not pinned and verified. Other elements (the benchmark endpoint, dataset parameters, and output_dir writes) appear consistent with standard benchmarking and do not show explicit exfiltration or credential theft within this snippet.
No overt malware is present in this YAML snippet. However, it materially increases supply-chain execution risk by enabling trust_remote_code: true for multiple serving frameworks while referencing a third-party model/tokenizer (MiniMaxAI/MiniMax-M2.5). If any backend honors this flag by executing repository-provided code during model loading, compromise of the model artifacts could lead to arbitrary code execution. The remainder of the config is standard for serving/benchmarking and shows no explicit data-theft or exfiltration behavior.
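For context, a minimal hypothetical sketch of the pattern these alerts describe follows. The section and key names (model, backends, and the per-framework blocks) are illustrative assumptions, not the audited file's actual layout; only trust_remote_code and the MiniMaxAI/MiniMax-M2.5 reference come from the alerts themselves.

```yaml
# Hypothetical reconstruction of the flagged pattern -- not the audited file.
model: MiniMaxAI/MiniMax-M2.5   # third-party repo reference with no pinned revision
backends:
  sglang:
    trust_remote_code: true     # lets repository-provided code run at load time
  vllm:
    trust_remote_code: true
  tensorrt_llm:
    trust_remote_code: true
```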
This configuration does not show direct malware behaviors within the YAML itself, but it materially increases supply-chain risk by enabling trust_remote_code: true for multiple LLM serving frameworks. That setting can allow execution of model-repository-provided code during model resolution/loading. The remaining behaviors (server startup, /v1/completions benchmarking, and writing results to output_dir) appear operationally typical, with secondary risks around benchmark artifact reuse (search.resume). Treat the referenced model artifacts as high-risk unless pinned to an immutable revision and verified/sandboxed according to your environment’s security controls.
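The operationally typical portion this alert refers to would look something like the sketch below; the key names (benchmark, endpoint, output_dir, search.resume) follow the alert's wording, but the exact structure is an assumption.

```yaml
# Illustrative sketch of the benchmark/search portion (structure assumed).
benchmark:
  endpoint: /v1/completions   # standard OpenAI-compatible completions route
  output_dir: ./results       # results written locally; no exfiltration implied
search:
  resume: true                # reuses prior benchmark artifacts -- the secondary risk noted above
```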
No overt malicious payload is visible in this YAML fragment; it is an orchestration/benchmark configuration. The major concern is that trust_remote_code: true is enabled for multiple inference frameworks while loading a remote model/tokenizer from an external repository, which can allow repository-supplied code to execute during model initialization. Treat this as a significant supply-chain execution risk and mitigate it by pinning exact model revisions/digests, verifying artifact provenance, and disabling trust_remote_code unless it is strictly necessary.
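As a hedged illustration of that mitigation, assuming each backend's loader accepts a Hugging Face-style revision field (exact key names vary by framework):

```yaml
# Hardened variant (illustrative): pin an immutable revision, drop trust_remote_code.
model: MiniMaxAI/MiniMax-M2.5
revision: "<full commit SHA of an audited snapshot>"  # immutable pin, not a mutable tag
backends:
  sglang:
    trust_remote_code: false   # enable only if the model genuinely requires custom code
  vllm:
    trust_remote_code: false
  tensorrt_llm:
    trust_remote_code: false
```

Pinning to a commit SHA rather than a branch or tag matters because tags can be moved after an audit; a SHA keeps the reviewed artifacts immutable.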