surprise-me
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGHPROMPT_INJECTION
Full Analysis
- [Prompt Injection] (HIGH): Vulnerable to Indirect Prompt Injection via dynamic orchestration.
- Ingestion points: The skill dynamically reads the contents of
<available_skills>and the fullSKILL.mdbody of any discovered skill. - Boundary markers: Absent. The skill does not use delimiters or instructions to ignore potential malicious prompts embedded within the discovered skills.
- Capability inventory: Step 4 ('Execute') explicitly directs the agent to 'Follow each skill's instructions for technical execution,' which can include arbitrary command execution, file modifications, or network requests depending on the available skills.
- Sanitization: Absent. There is no validation or filtering of the instructions found in the external
SKILL.mdfiles before execution. - [Prompt Injection] (MEDIUM): The 'Reveal' phase (Step 5) mandates 'minimal preamble' and explicitly discourages explaining which skills were combined ('Bad reveal: I decided to combine the pptx skill...'). This lack of transparency prevents the user from performing a 'human-in-the-loop' check to identify if a malicious skill has been unexpectedly triggered.
- [Command Execution] (LOW): Step 3 ('Interactive HTML experience') involves dynamic code generation. While common for creative tasks, when combined with untrusted data (like the news fallback), it increases the risk of generating or executing malicious scripts within the user's session.
Recommendations
- AI detected serious security threats
Audit Metadata