agent-safety
Pass
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: LOW
Full Analysis
- General Security (SAFE): The skill is primarily instructional and contains no malicious logic. The validation script uses
yaml.safe_load()which prevents remote code execution via YAML deserialization. - Category 8: Indirect Prompt Injection (LOW): The skill possesses an attack surface as it is designed to process external untrusted data for safety filtering. * Ingestion points: Data enters via the
taskparameter and processedllm_responsestrings. * Boundary markers: None explicitly defined in the provided templates. * Capability inventory: Limited to internal monitoring and content filtering; no file-write or subprocess execution capabilities. * Sanitization: Employs theguardrailslibrary for toxicity and PII detection.
Audit Metadata