ai-threat-testing
Audited by Socket on Feb 16, 2026
1 alert found:
Malware [Skill Scanner]: Detected attempt to override previous instructions.

This document describes a legitimate offensive AI/LLM pentesting framework with comprehensive capabilities. The fragment itself contains no embedded malware or obfuscated code, but it solicits and processes high-sensitivity inputs and automates powerful exploit techniques.

Primary risks:
(1) misuse by unauthorized actors or insiders;
(2) insecure handling/transmission of captured secrets and PII;
(3) potential for unintended DoS or cost impact if tests run without safeguards;
(4) supply-chain risk for agent binaries and reporting endpoints.

Recommended operational controls before deployment:
- enforce explicit authorization/attestation
- least-privilege credential use
- encrypted and access-controlled evidence storage
- sandboxed execution with throttling
- audit logging
- provenance and integrity checks for agent code/dependencies
- clear retention/redaction policies for captured data

LLM verification: The document describes a legitimate offensive AI pentesting skillset and workflows, but it contains multiple high-risk operational behaviors (log deletion, unauthorized API calls, privilege escalation) and a prompt-injection pattern flagged by static analysis that can enable safety bypass. There is no evidence of direct malware or obfuscation in the provided text; however, the lack of provenance, missing authorization and data-handling controls, and the described destructive capabilities make this m
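The provenance and integrity checks recommended for agent binaries could look like the following minimal sketch. The `PINNED_SHA256` table, the `"agent-binary"` name, and the hard-coded digest are hypothetical stand-ins; in a real deployment the expected digests would come from a signed release manifest rather than source code:

```python
import hashlib
from pathlib import Path

# Hypothetical pinned digests. In practice these would be loaded from a
# signed, versioned release manifest, not hard-coded.
PINNED_SHA256 = {
    # SHA-256 of an empty file, used here purely as a placeholder value.
    "agent-binary": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 so large binaries aren't read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_before_exec(name: str, path: Path) -> bool:
    """Refuse to run an agent binary whose digest does not match the pinned value."""
    expected = PINNED_SHA256.get(name)
    return expected is not None and sha256_of(path) == expected
```

A gate like `verify_before_exec` would sit in front of any code path that launches an agent binary or imports a fetched dependency, turning the "provenance and integrity checks" recommendation into a hard fail-closed step rather than a policy note.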