ctf-ai-ml

Pass

Audited by Gen Agent Trust Hub on Apr 5, 2026

Risk Level: SAFE
Findings: PROMPT_INJECTION, COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, DATA_EXFILTRATION
Full Analysis
  • [PROMPT_INJECTION]: The skill documentation includes numerous examples of direct and indirect prompt injection payloads.
      • Evidence: Payloads such as 'Ignore all previous instructions', 'DAN' jailbreak templates, and token smuggling techniques are documented in llm-attacks.md and SKILL.md.
      • Context: These are explicitly labeled as attack vectors for the user to study or use against CTF targets.
  • [PROMPT_INJECTION]: The documentation reveals an attack surface for indirect prompt injection, where the agent might ingest untrusted data.
      • Ingestion points: llm-attacks.md describes scenarios where the agent processes external web pages or retrieved documents (e.g., via WebFetch).
      • Boundary markers: The exploit examples show target systems that lack boundary markers or instructions to ignore embedded data.
      • Capability inventory: The skill environment provides Bash, Write, and WebFetch capabilities (SKILL.md).
      • Sanitization: The documented exploitation patterns do not sanitize or validate the ingested untrusted content.
  • [COMMAND_EXECUTION]: The skill provides various Python snippets for model analysis and weight manipulation, including the use of torch.load and safetensors in SKILL.md and model-attacks.md.
      • Potential Risk: torch.load deserializes with pickle, which can execute arbitrary code when it loads a malicious model file provided by a challenge author.
  • [DATA_EXFILTRATION]: The skill contains functional code for interacting with remote endpoints, which is used to send attack payloads and retrieve results.
      • Evidence: Scripts in llm-attacks.md and model-attacks.md use the requests library and curl to communicate with targets such as http://challenge:8080/api/chat.
  • [EXTERNAL_DOWNLOADS]: The skill installs common third-party machine learning and security libraries.
      • Evidence: pip install commands in SKILL.md target well-known packages including torch, transformers, scikit-learn, and peft.
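
The deserialization risk noted under COMMAND_EXECUTION can be illustrated with the standard library alone. The sketch below is not taken from the audited skill: `Payload`, `AllowlistUnpickler`, and `safe_loads` are hypothetical names, and a harmless `eval("6 * 7")` stands in for whatever an attacker would actually run. It shows that unpickling alone is enough to call an attacker-chosen function, and that an allowlist on `find_class` is one way to refuse such a file. PyTorch's `torch.load` accepts a `weights_only` argument that restricts unpickling in a similar spirit, and the safetensors format avoids pickle altogether.

```python
import io
import pickle


class Payload:
    """Hypothetical malicious object: pickle calls __reduce__ when dumping,
    and the callable it returns is invoked when the bytes are loaded."""

    def __reduce__(self):
        # Harmless stand-in; a real attack could return (os.system, ("...",)).
        return (eval, ("6 * 7",))


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that only resolves globals on an explicit allowlist.
    The allowlist below is an assumption chosen for illustration."""

    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Load pickle bytes, refusing any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    blob = pickle.dumps(Payload())

    # Plain unpickling runs the embedded call and returns its result.
    print("pickle.loads returned:", pickle.loads(blob))

    # The restricted loader rejects the same bytes before anything runs.
    try:
        safe_loads(blob)
    except pickle.UnpicklingError as exc:
        print("safe_loads refused:", exc)
```

Plain containers of numbers and strings still load through `safe_loads`, because they reference no globals; anything that needs a class or function outside the allowlist is rejected.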
Audit Metadata
  • Risk Level: SAFE
  • Analyzed: Apr 5, 2026, 02:12 PM