llm-jailbreaking

Fail

Audited by Socket on Feb 16, 2026

2 alerts found:

Malware, Obfuscated File

Malware: SKILL.md (HIGH)

[Skill Scanner] Detected jailbreak/DAN attempt

All findings:
[CRITICAL] prompt_injection: Detected jailbreak/DAN attempt (PI003) [AITech 1.1]
[CRITICAL] prompt_injection: Detected jailbreak/DAN attempt (PI003) [AITech 1.1]
[CRITICAL] prompt_injection: Detected jailbreak/DAN attempt (PI003) [AITech 1.1]

This module is an operational, dual-use LLM jailbreaking toolkit: it provides concrete payloads, evasion techniques (encoding, zero-width, fragmentation), success metrics, and an automated test harness to probe models. It lacks authorization checks, ethical constraints, and safe-testing provisions and therefore poses a material risk of misuse (extraction of disallowed outputs, system prompt leakage, or large-scale probing).

Recommend treating it as high-risk: do not execute against third-party or production models without explicit written permission and governance; if used defensively, require consent, audit logging, rate limits, and environment isolation.

LLM verification: This module is high-risk due to its explicit documentation and tooling for bypassing LLM safety guards. It is not 'malware' in the traditional sense, but it materially lowers the barrier to perform harmful prompt-injection and jailbreak testing against models. Recommend treating the repo as sensitive: restrict execution to isolated test environments, require authorization and audit logging, remove or redact directly usable payload examples from public branches, harden detection heuristics, and a

Confidence: 98%
Severity: 90%
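
The controls recommended in this finding (explicit authorization, audit logging, rate limits) could be sketched as a guard wrapped around whatever function a defensive test harness uses to contact a model. This is only an illustration under stated assumptions and is not part of the audited package: `GuardedRunner`, `send_fn`, the target allowlist, and the logger name `redteam.audit` are all hypothetical, and environment isolation is outside what a code wrapper can show.

```python
import logging
import time

# Hypothetical logger name; a real deployment would route this to durable storage.
audit_log = logging.getLogger("redteam.audit")


class GuardedRunner:
    """Wrap a send function with authorization, rate-limit, and audit checks."""

    def __init__(self, send_fn, authorized_targets, min_interval_s=1.0,
                 clock=time.monotonic):
        self._send = send_fn                      # hypothetical: send_fn(target, prompt)
        self._authorized = set(authorized_targets)  # targets with written permission
        self._min_interval = min_interval_s       # minimum seconds between calls
        self._clock = clock                       # injectable so tests need no sleeping
        self._last_call = None

    def run(self, target, prompt, operator):
        # 1. Authorization: refuse any target that is not explicitly allowlisted.
        if target not in self._authorized:
            audit_log.warning("DENIED operator=%s target=%s", operator, target)
            raise PermissionError(f"no authorization on file for {target!r}")

        # 2. Rate limit: enforce a minimum spacing between accepted calls.
        now = self._clock()
        if self._last_call is not None and now - self._last_call < self._min_interval:
            audit_log.warning("RATE_LIMITED operator=%s target=%s", operator, target)
            raise RuntimeError("rate limit exceeded")
        self._last_call = now

        # 3. Audit: record who contacted what (prompt length only, not its content).
        audit_log.info("SENT operator=%s target=%s prompt_chars=%d",
                       operator, target, len(prompt))
        return self._send(target, prompt)
```

Denied and rate-limited attempts are logged as well as accepted ones, since an audit trail that records only successes would hide exactly the misuse this finding describes.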

Obfuscated File: assets/jailbreak-library.yaml (HIGH)

This YAML is a deliberate collection of LLM jailbreak techniques and templates explicitly intended to subvert model safety and extract restricted outputs. While not executable malware, it is a high-risk orchestration asset: when consumed by prompt-generation pipelines or malicious operators, it materially increases the likelihood that models will produce harmful, disallowed, or secret-revealing content. Treat as hazardous: restrict distribution, remove from automated prompt pipelines, and apply access controls and monitoring if retained for research or defensive use.

Confidence: 98%

Audit Metadata

Analyzed At

Feb 16, 2026, 10:50 PM

Package URL

pkg:socket/skills-sh/pluginagentmarketplace%2Fcustom-plugin-ai-red-teaming%2Fllm-jailbreaking%2F@e1a4e1938d060068661bf5eb900bca31cfe8397a
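
The Package URL above percent-encodes the slashes in the skill path, which makes it hard to read at a glance. A minimal sketch of splitting and decoding it, assuming the usual `pkg:type/namespace/name@version` layout (the helper name `split_purl` is hypothetical, and this is not a full Package URL parser):

```python
from urllib.parse import unquote


def split_purl(purl):
    """Split a 'pkg:type/namespace/name@version' string into decoded parts."""
    _scheme, _, body = purl.partition(":")       # drop the leading 'pkg'
    path, _, version = body.rpartition("@")      # version follows the last '@'
    pkg_type, namespace, name = path.split("/", 2)
    return {
        "type": pkg_type,
        "namespace": namespace,
        "name": unquote(name),                   # turns %2F back into '/'
        "version": version,
    }


info = split_purl(
    "pkg:socket/skills-sh/pluginagentmarketplace%2Fcustom-plugin-ai-red-teaming"
    "%2Fllm-jailbreaking%2F@e1a4e1938d060068661bf5eb900bca31cfe8397a"
)
print(info["name"])  # pluginagentmarketplace/custom-plugin-ai-red-teaming/llm-jailbreaking/
```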