llm-safety-patterns

Pass

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: LOW
Full Analysis
  • Security Analysis (SAFE): A comprehensive audit of all files against the 10 threat categories confirms that the skill is purely defensive and educational.
  • Category 1 (Prompt Injection): No injection, bypass, or override patterns were found. The skill contains instructions and code (e.g., audit_prompt in scripts/prompt_builder.py) specifically designed to detect and block injection attempts.
  • Category 2 (Data Exposure & Exfiltration): No hardcoded credentials or sensitive data exfiltration patterns were detected. The skill implements regular expressions to identify and redact UUIDs and sensitive identifiers from prompts.
  • Category 4 (Unverifiable Dependencies & RCE): No remote code execution patterns, piped shell scripts, or untrusted package downloads were identified. The dependency list is limited to standard Python components and Pydantic.
  • Category 8 (Indirect Prompt Injection): The skill identifies the processing of untrusted external content as a risk and provides a 'SafePromptBuilder' to mitigate this attack surface. It uses explicit boundary markers (e.g., SYSTEM, USER QUERY, CONTEXT) and automated sanitization to ensure that document content cannot override the agent's instructions.
  • Category 10 (Dynamic Execution): No dangerous use of eval(), exec(), or runtime compilation was found. The code uses standard templating and parameterized database queries.
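To illustrate the Category 1 finding, the following is a minimal sketch of an `audit_prompt`-style check that flags common override phrases. The pattern list and function signature are assumptions for illustration; the actual implementation in scripts/prompt_builder.py is not reproduced here.

```python
import re

# Hypothetical injection-detection sketch; patterns are illustrative,
# not the skill's actual rule set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?system prompt", re.IGNORECASE),
    re.compile(r"you are now [a-z]", re.IGNORECASE),
]

def audit_prompt(text: str) -> list[str]:
    """Return the regex patterns that matched, empty if the text looks clean."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]
```

A caller would typically block or quarantine any prompt for which `audit_prompt` returns a non-empty list.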
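The Category 2 redaction behavior can be sketched with a standard UUID regular expression; the replacement token and function name below are assumptions, not the skill's actual identifiers.

```python
import re

# Matches canonical 8-4-4-4-12 hex UUIDs, case-insensitively.
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)

def redact_identifiers(text: str) -> str:
    """Replace UUID-shaped identifiers before the text enters a prompt."""
    return UUID_RE.sub("[REDACTED-UUID]", text)
```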
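The boundary-marker approach described under Category 8 can be sketched as follows. Method names, marker syntax, and the line-dropping sanitizer are assumptions; the audited 'SafePromptBuilder' may differ in detail.

```python
class SafePromptBuilder:
    """Sketch of a boundary-marked prompt builder (names assumed)."""

    def __init__(self, system: str):
        self.system = system
        self.context_blocks: list[str] = []

    @staticmethod
    def _sanitize(text: str) -> str:
        # Drop lines that try to forge section markers, so untrusted
        # content cannot impersonate SYSTEM / USER QUERY / CONTEXT sections.
        return "\n".join(
            line for line in text.splitlines()
            if not line.strip().upper().startswith(("SYSTEM", "USER QUERY", "CONTEXT"))
        )

    def add_context(self, doc: str) -> None:
        self.context_blocks.append(self._sanitize(doc))

    def build(self, user_query: str) -> str:
        parts = [f"=== SYSTEM ===\n{self.system}"]
        for block in self.context_blocks:
            parts.append(f"=== CONTEXT (untrusted) ===\n{block}")
        parts.append(f"=== USER QUERY ===\n{self._sanitize(user_query)}")
        return "\n\n".join(parts)
```

The key property is that only the builder emits section markers: a document containing a forged "SYSTEM:" line has that line stripped before it is placed inside the explicitly untrusted CONTEXT section.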
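The parameterized-query pattern noted under Category 10 looks like this in standard DB-API style (shown here with sqlite3 purely as an illustration; the skill's actual database layer is not specified in the audit):

```python
import sqlite3

# Placeholders keep user-supplied values out of the SQL text entirely,
# so they can never be interpreted as SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE findings (id INTEGER, label TEXT)")
conn.execute("INSERT INTO findings VALUES (?, ?)", (1, "prompt-injection"))
row = conn.execute(
    "SELECT label FROM findings WHERE id = ?", (1,)
).fetchone()
```

Contrast with string interpolation (`f"... WHERE id = {user_input}"`), which is exactly what such an audit would flag.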
Audit Metadata
  • Risk Level: LOW
  • Analyzed: Feb 16, 2026, 12:20 AM