ai-engineering
Pass
Audited by Gen Agent Trust Hub on Feb 20, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- Indirect Prompt Injection (LOW): Multiple modules describe patterns where untrusted external data is interpolated into LLM prompts without robust sanitization.
- Ingestion points: Data enters the context via user feedback strings in
user-feedback/SKILL.md, retrieved documents inrag-systems/SKILL.md, and tool execution results inai-agents/SKILL.md. - Boundary markers: While the
ai-agents/SKILL.mdmodule uses anObservation:marker in the ReAct loop, most other modules lack explicit delimiters (e.g., XML tags or JSON schemas) to isolate untrusted content from instructions. - Capability inventory: The patterns described include high-privilege capabilities such as tool execution in
ai-agents/SKILL.md, and automated finetuning/deployment logic inuser-feedback/SKILL.md. - Sanitization: A regex-based
sanitize_inputfunction is demonstrated inprompt-engineering/SKILL.mdfor defense, but the skill does not consistently apply this or other sanitization methods to the data ingestion examples in other modules. - Prompt Injection (SAFE): The
prompt-engineering/SKILL.mdandguardrails-safety/SKILL.mdfiles contain explicit prompt injection strings (e.g., 'ignore previous instructions'). These are used exclusively for educational purposes to demonstrate detection and defensive prompting techniques.
Audit Metadata