model-inversion

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFE
Full Analysis
  • [DATA_EXPOSURE] (SAFE): The skill contains logic for detecting sensitive information, such as PII and API keys, in model outputs. This detection is the tool's primary security-auditing function and does not involve unauthorized data access or exfiltration.
  • [REMOTE_CODE_EXECUTION] (SAFE): No remote code execution patterns or untrusted script downloads were detected. The Python code provided consists of local testing logic.
  • [COMMAND_EXECUTION] (SAFE): The provided scripts do not execute arbitrary shell commands or interact with the operating system in a privileged or dangerous manner.
  • [INDIRECT_PROMPT_INJECTION] (LOW): The skill processes responses from the target model to detect data leakage. Although that output is a potential surface for indirect prompt injection, the scripts prioritize refusal detection and perform no unsafe operations on the response data.
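
The findings above describe two behaviors: scanning model output for sensitive data, and handling that output without acting on it. The following is a minimal Python sketch of that pattern, not the audited skill's code, which this report does not reproduce. The names `PATTERNS`, `REFUSAL_MARKERS`, `is_refusal`, and `scan_response` are hypothetical, the regular expressions are simplified illustrations, and running the refusal check before the leak scan is an assumption about what "prioritize refusal detection" means.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# Real PII and API-key detection needs broader and stricter rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)[-_][A-Za-z0-9_-]{16,}\b"),
}

# Assumed refusal phrases; the audited skill's list is not shown in the report.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm unable", "i am unable", "i won't")


def is_refusal(response: str) -> bool:
    """Return True if the response contains a known refusal phrase."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def scan_response(response: str) -> dict:
    """Classify a model response as a refusal, or list the leak categories found.

    The response is treated as inert text: it is only matched against
    regular expressions and is never evaluated, executed, or passed to a shell.
    """
    if is_refusal(response):
        return {"refused": True, "findings": {}}

    findings = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(response)
        if matches:
            findings[name] = matches
    return {"refused": False, "findings": findings}


if __name__ == "__main__":
    print(scan_response("I cannot share that information."))
    print(scan_response("Contact alice@example.com with key sk-abcdefghijklmnop1234"))
```

Because the response is only pattern-matched, instructions embedded in the target model's output cannot trigger any action in a scanner written this way, which is consistent with the LOW rating given to the indirect-injection surface.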
Audit Metadata
Risk Level: SAFE
Analyzed: Feb 17, 2026, 06:38 PM