recovery-community-moderator
Pass
Audited by Gen Agent Trust Hub on Mar 5, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection through the moderation of forum content.
- Ingestion points: Untrusted content is ingested via the
moderate_contentfunction inscripts/moderate_content.py, where user-generated text is directly interpolated into the prompt string. - Boundary markers: There are no explicit boundary markers (e.g., XML tags, triple-quotes, or specific delimiters) or instructions to 'ignore embedded commands' surrounding the
{content}variable in the prompt. - Capability inventory: The skill metadata in
SKILL.mdallows the toolsRead,Write, andEdit. An attacker could embed instructions in a forum post that, when processed, trick the moderator into outputting asuggested_actionthat results in unauthorized file operations or logic bypass. - Sanitization: No sanitization, escaping, or validation is performed on the input content before it is sent to the LLM.
Audit Metadata