ai-threat-testing
Pass
Audited by Gen Agent Trust Hub on Apr 20, 2026
Risk Level: SAFE
Full Analysis
- [PROMPT_INJECTION]: The skill documentation includes adversarial strings such as 'Ignore all previous instructions' and 'Disregard safety guidelines' within the file reference/llm01-prompt-injection.md. These are listed as example payloads for testing target models and are contextually separated from the agent's operational instructions.
- [REMOTE_CODE_EXECUTION]: No evidence of automated remote code execution or suspicious dependency management was found. While the skill methodology mentions running security tools (e.g., npm audit), these are described as manual steps for the tester.
- [DATA_EXFILTRATION]: No patterns for harvesting or exfiltrating sensitive data (such as environment variables or credentials) were detected. The skill's focus on data exposure is limited to identifying vulnerabilities in external target systems.
- [PROMPT_INJECTION]: The skill processes untrusted data from target applications (an indirect prompt injection surface). While the methodology involves analyzing target responses, there are no instructions that exploit this surface to harm the agent.
Audit Metadata