The Agent Skills Directory

Indirect Prompt Injection (LOW): The skill processes untrusted user input to classify topics and generate responses. While this represents a standard injection surface, the skill implements specific mitigation strategies including a detection function (detect_prompt_injection) and output validation to minimize risk.
Ingestion points: user_input variable in classify_topic and apply_guardrails functions.
Boundary markers: Absent in the classification prompt interpolation, but prefaced by a detection step.
Capability inventory: Calls to an internal LLM tool for classification and response generation.
Sanitization: Includes regex-based detection for common injection patterns and PII redaction logic.
Unverifiable Dependencies & Remote Code Execution (SAFE): The skill references the presidio-analyzer and presidio-anonymizer libraries. These are reputable, well-known open-source packages for PII management.
Prompt Injection (SAFE): The skill contains lists of prompt injection patterns (e.g., 'ignore previous instructions'), but these are used strictly for detection and filtering purposes, not as instructions to the agent.

guardrails-safety-filter-builder