rag-exploitation
Warn
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: MEDIUMPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
- [Prompt Injection] (MEDIUM): 'SKILL.md' contains multiple injection templates designed to override agent behavior, such as 'IMPORTANT SYSTEM NOTE' and 'ADMIN_OVERRIDE'. These are intended to test if RAG systems correctly handle untrusted context. (Severity: HIGH, downgraded due to primary security-testing purpose).
- [Obfuscation] (MEDIUM): The 'ContextInjectionAttack' class in 'SKILL.md' includes a 'hide_with_zero_width' method that uses Unicode characters (U+200B, U+200C) to conceal malicious instructions within legitimate document content. (Severity: HIGH, downgraded due to primary security-testing purpose).
- [Indirect Prompt Injection] (LOW): The skill documents and exploits vulnerability surfaces where untrusted data enters the agent context via RAG retrieval. Evidence: 1. Ingestion points: 'assets/rag-attacks.yaml' and 'scripts/test-rag-security.py'. 2. Boundary markers: Absent in the provided attack payloads. 3. Capability inventory: 'scripts/test-rag-security.py' (file-write) and 'SKILL.md' (retrieval access). 4. Sanitization: Discussed in 'references/RAG-VULNERABILITIES.md' but not implemented in the active attack logic.
- [Data Exposure & Exfiltration] (LOW): The testing framework in 'scripts/test-rag-security.py' includes specific test cases (e.g., RAG-005) designed to attempt the extraction of sensitive or confidential documents from the knowledge base.
Audit Metadata