self-reflection
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
- PROMPT_INJECTION (HIGH): The skill is vulnerable to persistent indirect prompt injection that can permanently alter agent behavior.
  * Ingestion points: The agent reviews conversation history and incorporates user corrections during the 'Human-in-the-loop' phase (File: SKILL.md).
  * Boundary markers: Absent; the reflection template uses standard markdown headers and bullets without specific delimiters to isolate lessons from system instructions.
  * Capability inventory: Directs the agent to create or update persistent files such as 'memories/lessons-learned.md' (File: SKILL.md).
  * Sanitization: No instructions exist to sanitize, escape, or validate the drafted lessons against safety protocols before they are committed to storage.
- COMMAND_EXECUTION (LOW): The workflow encourages the agent to 'run commands' and use shell utilities like 'test -f' for verification (File: SKILL.md), which reinforces the use of powerful system tools based on potentially untrusted context generated during task failures.
Recommendations
- AI detected serious security threats
Audit Metadata