The Agent Skills Directory

Prompt Injection (LOW): The script scripts/view_image.py accepts a user-provided prompt that is passed directly to the vision-language model. This could be used to bypass intent filters if the model is not properly constrained.
Indirect Prompt Injection (LOW): The skill processes external image files which may contain adversarial text intended to override agent instructions.
Ingestion points: image_path argument in scripts/view_image.py.
Boundary markers: Absent; prompts are formatted via apply_chat_template without specific delimiters for untrusted content.
Capability inventory: Limited to printing text to standard output. No file-write, network-egress, or command-execution capabilities are present in the script after model inference.
Sanitization: None; the model processes raw visual data from the image.
External Downloads (LOW): The skill downloads approximately 4GB of model weights from Hugging Face (HuggingFaceTB/SmolVLM-Instruct) during its first execution. This is the intended behavior for local model operation but constitutes a significant external dependency.

smolvlm