The Agent Skills Directory

[PROMPT_INJECTION]: The skill documentation includes numerous examples of direct and indirect prompt injection payloads.
Evidence: Payloads like 'Ignore all previous instructions', 'DAN' jailbreak templates, and token smuggling techniques are documented in llm-attacks.md and SKILL.md.
Context: These are explicitly labeled as attack vectors for the user to study or use against CTF targets.
[PROMPT_INJECTION]: Documentation reveals an attack surface for indirect prompt injection where the agent might ingest untrusted data.
Ingestion points: llm-attacks.md describes scenarios where the agent processes external web pages or retrieved documents (e.g., via WebFetch).
Boundary markers: The provided exploit examples demonstrate the absence of boundary markers or instructions to ignore embedded data in target systems.
Capability inventory: The skill environment provides Bash, Write, and WebFetch capabilities (SKILL.md).
Sanitization: No sanitization or validation of the ingested untrusted content is present in the documented exploitation patterns.
[COMMAND_EXECUTION]: The skill provides various Python snippets for model analysis and weight manipulation, including the use of torch.load and safetensors in SKILL.md and model-attacks.md.
Potential Risk: torch.load performs deserialization which can execute arbitrary code if used on a malicious model file provided by a challenge author.
[DATA_EXFILTRATION]: The skill contains functional code for interacting with remote endpoints, which is used to send attack payloads and retrieve results.
Evidence: Scripts in llm-attacks.md and model-attacks.md use the requests library and curl to communicate with targets like http://challenge:8080/api/chat.
[EXTERNAL_DOWNLOADS]: The skill performs installations of common third-party machine learning and security libraries.
Evidence: pip install commands in SKILL.md target well-known packages including torch, transformers, scikit-learn, and peft.

ctf-ai-ml