guardrails-safety-filter-builder

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFE
Full Analysis
  • PROMPT_INJECTION (SAFE): The classify_topic function in SKILL.md interpolates user-provided queries directly into an LLM prompt template. While this is a common surface for indirect prompt injection, the risk is mitigated by the skill's primary purpose as a safety filter and its multi-layered detection approach.
    • Ingestion points: query parameter in classify_topic.
    • Boundary markers: absent in the prompt template.
    • Capability inventory: the script facilitates tool calls to llm() but does not include file system, network, or shell execution capabilities.
    • Sanitization: the apply_guardrails pipeline runs detect_prompt_injection regex checks as a primary layer of defense before classification.
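The layered pipeline described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names (apply_guardrails, detect_prompt_injection, classify_topic, llm) come from the audit, but their bodies and the regex patterns are assumptions.

```python
import re

# Illustrative injection phrasings; the skill's actual pattern list is not
# shown in the audit and would need to be far more comprehensive.
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"disregard (the )?(above|prior) (rules|instructions)",
    r"you are now",
]

def detect_prompt_injection(query: str) -> bool:
    """Regex pre-filter: flag common injection phrasings before classification."""
    return any(re.search(p, query, re.IGNORECASE) for p in INJECTION_PATTERNS)

def classify_topic(query: str, llm=lambda prompt: "general") -> str:
    """Interpolates the query into a prompt template, as the audit notes.
    Without boundary markers, the query text is indistinguishable from the
    template's own instructions -- exactly the surface the audit calls out."""
    prompt = f"Classify the topic of the following user query:\n{query}"
    return llm(prompt)

def apply_guardrails(query: str) -> str:
    """Layered pipeline: regex injection check first, LLM classification second."""
    if detect_prompt_injection(query):
        return "BLOCKED: suspected prompt injection"
    return classify_topic(query)
```

Because the regex layer runs before any prompt is built, a flagged query never reaches the template, which is what keeps the unmarked interpolation tolerable in this design.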
Audit Metadata
  • Risk Level: SAFE
  • Analyzed: Feb 17, 2026, 06:43 PM