serving-llms-vllm
Warn
Audited by Gen Agent Trust Hub on Mar 28, 2026
Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONDATA_EXFILTRATIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The troubleshooting guide in
references/troubleshooting.mdsuggests usingsudo ufw allow 8000to open firewall ports. This command uses administrative privileges to modify the system's security configuration, potentially increasing the network attack surface.\n- [REMOTE_CODE_EXECUTION]: Instructions inSKILL.mdandreferences/troubleshooting.mdrecommend using the--trust-remote-codeflag to support custom models. This flag allows the execution of arbitrary Python code included in the model repository, which can lead to remote code execution if a model is loaded from an untrusted source.\n- [DATA_EXFILTRATION]: Deployment patterns inSKILL.mdandreferences/server-deployment.mdadvise binding the server to0.0.0.0and exposing a Prometheus metrics endpoint on port 9090. This configuration exposes the inference API and system performance data to the network, which may lead to unauthorized data access if not properly restricted.\n- [EXTERNAL_DOWNLOADS]: The skill involves the installation of several third-party libraries includingvllm,locust,autoawq,auto-gptq, andflash-attn. While these are recognized tools, they represent an external dependency chain.\n- [PROMPT_INJECTION]: The skill facilitates the serving of LLMs that process arbitrary user input, creating a surface for indirect prompt injection.\n - Ingestion points: User prompts are ingested via the OpenAI-compatible API described in
SKILL.md.\n - Boundary markers: Absent; there are no instructions for using delimiters or warnings to ignore instructions embedded in data.\n
- Capability inventory: The vLLM server has network access and can perform code execution if the trust flag is enabled.\n
- Sanitization: Absent; no methods for input filtering or validation are documented.
Audit Metadata