serving-llms-vllm
Warn
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: MEDIUMREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
- REMOTE_CODE_EXECUTION (MEDIUM): The skill repeatedly suggests using the
--trust-remote-codeflag (found inSKILL.mdandreferences/troubleshooting.md) when loading custom or gated models. This flag explicitly permits the execution of arbitrary Python code defined in the model's configuration or repository files on the host machine. - COMMAND_EXECUTION (LOW): The skill includes numerous shell commands for package management (
pip install), container orchestration (docker run), and system firewall modification (sudo ufw allow 8000). While typical for DevOps tasks, these commands require administrative oversight. - DATA_EXPOSURE (LOW): Deployment patterns in
references/server-deployment.mdandSKILL.mdsuggest binding the inference server to0.0.0.0with a defaultapi_key='EMPTY'. Without additional network security (like a reverse proxy or VPC), this exposes the model's capabilities and potential local data processed by the model to the public internet. - INDIRECT_PROMPT_INJECTION (LOW): The skill documents workflows that ingest untrusted user prompts into an LLM engine (Category 8).
- Ingestion points:
prompts.txtand OpenAI-compatible API endpoints. - Boundary markers: Missing; the examples interpolate raw text into
llm.generate()calls. - Capability inventory: Subprocess calls via
vllm serveand Python execution. - Sanitization: None provided in the examples.
Audit Metadata