serving-llms-vllm

Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUMREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
  • REMOTE_CODE_EXECUTION (MEDIUM): The skill repeatedly suggests using the --trust-remote-code flag (found in SKILL.md and references/troubleshooting.md) when loading custom or gated models. This flag explicitly permits the execution of arbitrary Python code defined in the model's configuration or repository files on the host machine.
  • COMMAND_EXECUTION (LOW): The skill includes numerous shell commands for package management (pip install), container orchestration (docker run), and system firewall modification (sudo ufw allow 8000). While typical for DevOps tasks, these commands require administrative oversight.
  • DATA_EXPOSURE (LOW): Deployment patterns in references/server-deployment.md and SKILL.md suggest binding the inference server to 0.0.0.0 with a default api_key='EMPTY'. Without additional network security (like a reverse proxy or VPC), this exposes the model's capabilities and potential local data processed by the model to the public internet.
  • INDIRECT_PROMPT_INJECTION (LOW): The skill documents workflows that ingest untrusted user prompts into an LLM engine (Category 8).
  • Ingestion points: prompts.txt and OpenAI-compatible API endpoints.
  • Boundary markers: Missing; the examples interpolate raw text into llm.generate() calls.
  • Capability inventory: Subprocess calls via vllm serve and Python execution.
  • Sanitization: None provided in the examples.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Feb 17, 2026, 06:03 PM