moai-core-personas
Pass
Audited by Gen Agent Trust Hub on Mar 2, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill implements a persona adaptation system that can be influenced by specific user-provided keywords. A significant surface for indirect prompt injection exists where the agent may grant 'Auto-approve' status to users who trigger the 'Expert' persona using keywords like 'just do it' or 'no fluff'.
- Ingestion points: User input (user_request) is analyzed to extract expertise signals and triggers for persona switching as documented in SKILL.md and reference.md.
- Boundary markers: The documentation and pseudo-code do not define delimiters or specific instructions to separate untrusted user data from the agent's internal persona-selection logic.
- Capability inventory: The agent utilizes tools such as TodoWrite and Read. The 'Auto-approve' logic for 'Expert' personas could lead to these tools being executed without user confirmation if the system is manipulated.
- Sanitization: The expertise detection algorithm lacks validation or sanitization, allowing an adversary to artificially inflate an 'expert_score' through simple keyword injection to gain higher privileges (auto-approval).
Audit Metadata