agentic-rag-patterns
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
- Indirect Prompt Injection (HIGH): The skill's architecture creates a significant attack surface for indirect prompt injection. It retrieves data from potentially attacker-controlled sources (web search results via Tavily and vector database documents) and uses this content to influence its internal logic flow.
- Ingestion points: The
web_searchfunction inSKILL.mdandadaptive_retrieveinreferences/self-rag.mdingest untrusted content from the web and external databases. - Boundary markers: The
generatenode inSKILL.mdconcatenates document content with simple numeric indices (e.g.,[1] doc...) and newlines, lacking robust delimiters or 'ignore' instructions to prevent the model from obeying commands embedded within the documents. - Capability inventory: The skill executes network searches (
tavily_client.search) and performs 'self-correcting' logic (routing, query rewriting, and document grading) based on the ingested data. - Sanitization: No sanitization or filtering logic is present for retrieved text before it is interpolated into LLM prompts for grading or generation.
- Network Operation (LOW): The skill performs outbound network requests to
tavily.comvia thetavily_client. While this is consistent with the stated purpose (web fallback), the domain is not on the standard whitelist for trusted data destinations. - Metadata Anomalies (MEDIUM): The
SKILL.mdfile references a future version of LangGraph (1.0.6, Jan 2026), which suggests the instructions may be generated or include hallucinated metadata intended to simulate authority or specific environment requirements.
Recommendations
- AI detected serious security threats
Audit Metadata