skills/jd-opensource/joysafeter/pentest-ai-llm-security

pentest-ai-llm-security

SKILL.md

Pentest AI/LLM Security

Purpose

AI-integrated applications introduce entirely new attack surfaces. Prompt injection is the "SQLi of AI." Neither Shannon nor any existing skill addresses this domain. OWASP LLM Top 10 (2025) defines the methodology.

Prerequisites

Authorization Requirements

  • Written authorization with AI/LLM testing scope explicitly included
  • Model access details — API endpoints, model versions, tool/function access
  • Data sensitivity classification — what data the LLM can access
  • Rate limit awareness — LLM API costs can escalate quickly

Environment Setup

  • Garak for automated LLM vulnerability scanning
  • Burp Suite for API interception of LLM requests/responses
  • Python scripts for custom prompt injection payloads
  • Local proxy to capture full request/response chains

Core Workflow

  1. Integration Point Discovery: Identify all LLM integration points — chat interfaces, content generation, RAG pipelines, AI search, code completion, summarization.
  2. Direct Prompt Injection: Override system prompts, extract system prompt content, inject instructions that change model behavior.
  3. Indirect Prompt Injection: Embed malicious instructions in documents/emails/web pages the LLM processes, poisoned RAG context.
  4. Data Exfiltration: Extract training data, PII from context windows, other users' conversation history, system config details.
  5. Insecure Output Handling: LLM output rendered as HTML (XSS via LLM), used in SQL queries (SQLi via LLM), used in system commands.
  6. Excessive Agency: LLM with tool access performing unauthorized actions, privilege escalation through tool chains, resource abuse.
  7. Classification: Document findings with OWASP LLM Top 10 (2025) classification and remediation guidance.

OWASP LLM Top 10 (2025) Coverage

Category Test Focus Status
LLM01 Prompt Injection Direct and indirect injection
LLM02 Sensitive Information Disclosure Data exfiltration, PII leakage
LLM03 Supply Chain Model provenance, plugin trust
LLM04 Data and Model Poisoning Training data integrity
LLM05 Improper Output Handling XSS/SQLi via LLM output
LLM06 Excessive Agency Unauthorized tool use
LLM07 System Prompt Leakage System prompt extraction
LLM08 Vector and Embedding Weaknesses RAG poisoning
LLM09 Misinformation Hallucination exploitation
LLM10 Unbounded Consumption Resource exhaustion

Tool Categories

Category Tools Purpose
LLM Scanning Garak, rebuff Automated prompt injection testing
API Interception Burp Suite, mitmproxy LLM API request/response capture
Prompt Fuzzing Custom Python scripts Payload generation and testing
Output Analysis Browser DevTools, Burp Insecure output rendering detection

References

  • references/tools.md - Tool function signatures and parameters
  • references/workflows.md - Attack pattern definitions and test vectors
Weekly Installs
30
GitHub Stars
175
First Seen
Feb 18, 2026
Installed on
github-copilot30
codex30
kimi-cli30
amp30
gemini-cli30
cursor30