smart-delegation
Fail
Audited by Gen Agent Trust Hub on Mar 7, 2026
Risk Level: HIGHPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
- [PROMPT_INJECTION]: The skill contains explicit instructions and triggers designed to bypass AI safety filters. It defines an 'Unfiltered' mode triggered by phrases like 'no guardrails' and 'unfiltered'.
- [PROMPT_INJECTION]: The task templates for sub-agents explicitly command the model to ignore 'corporate safety theater' and provide 'uncensored' or 'politically incorrect' answers, which is a direct attempt to override safety protocols.
- [DATA_EXFILTRATION]: The skill provides instructions to pack context from sensitive files, such as 'USER.md' (containing personal facts) and 'SOUL.md' (containing identity data), and send them to external third-party model providers like x-ai (Grok) or OpenRouter.
- [PROMPT_INJECTION]: The delegation logic lacks proper boundary markers or sanitization when interpolating user-provided questions into the 'sessions_spawn' task string, making the sub-agent vulnerable to indirect prompt injection from the user's initial input.
Recommendations
- AI detected serious security threats
Audit Metadata