discord-voice
Pass
Audited by Gen Agent Trust Hub on Feb 18, 2026
Risk Level: SAFE, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION, COMMAND_EXECUTION
Full Analysis
- Dynamic Execution (LOW): The skill dynamically loads the host's extension API using `import()` on computed paths.
  - Evidence: `src/core-bridge.ts` resolves the OpenClaw root directory and imports `dist/extensionAPI.js` at runtime. `scripts/smoke-test.mjs` performs a similar operation.
  - Mitigation: The skill performs integrity checks by reading the target directory's `package.json` and verifying the name is 'openclaw' before importing. Paths are normalized using `path.resolve()` to prevent traversal.
  - Context: This is a required architectural pattern for OpenClaw plugins; severity is downgraded as it is central to the skill's primary purpose.
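The mitigation described for the Dynamic Execution finding can be sketched roughly as follows. This is a minimal illustration, not the code in `src/core-bridge.ts`: the function names `isOpenClawManifest` and `loadExtensionApi` are hypothetical, and the sketch assumes the module is compiled as ESM so that `import()` accepts a `file://` URL.

```typescript
import { readFile } from "node:fs/promises";
import * as path from "node:path";
import { pathToFileURL } from "node:url";

// Hypothetical helper: true only if the given package.json text
// parses as an object whose "name" field is exactly "openclaw".
export function isOpenClawManifest(rawJson: string): boolean {
  try {
    const pkg: unknown = JSON.parse(rawJson);
    return (
      typeof pkg === "object" &&
      pkg !== null &&
      (pkg as { name?: unknown }).name === "openclaw"
    );
  } catch {
    return false; // malformed JSON is rejected
  }
}

// Hypothetical loader: normalizes the candidate root, checks the manifest,
// and only then imports dist/extensionAPI.js from that directory.
export async function loadExtensionApi(candidateRoot: string): Promise<unknown> {
  const root = path.resolve(candidateRoot); // collapses "." and ".." segments
  const manifest = await readFile(path.join(root, "package.json"), "utf8");
  if (!isOpenClawManifest(manifest)) {
    throw new Error(`Refusing to import from ${root}: not an openclaw package`);
  }
  const entry = path.join(root, "dist", "extensionAPI.js");
  return import(pathToFileURL(entry).href);
}
```

Keeping the manifest check as a pure function makes it easy to unit-test without touching the filesystem. A name check of this kind only confirms the directory claims to be the expected package; it is not a cryptographic integrity guarantee.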
- Indirect Prompt Injection (LOW): The skill ingests untrusted audio data from Discord users, which is then transcribed and processed by an AI agent.
  - Ingestion points: Voice audio is captured in real-time and converted to text via `src/streaming-tts.ts` or `src/stt.ts` (not fully provided, but referenced).
  - Boundary markers: The skill uses an `extraSystemPrompt` to instruct the agent on response constraints, but explicit delimiters for the transcribed user text are not visible in the bridge logic.
  - Capability inventory: The agent can use the `discord_voice` tool to join/leave channels and play audio.
  - Sanitization: Discord `userId` is validated against a snowflake regex, and administrative configuration is used for sensitive parameters.
- External Downloads (LOW): The skill makes outbound network requests to third-party STT/TTS providers.
  - Evidence: `src/streaming-tts.ts` sends audio data and API keys to `api.openai.com` and `api.elevenlabs.io`.
  - Context: These are standard operations for a voice-enabled AI skill.
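Two points from the Indirect Prompt Injection finding, the snowflake check on `userId` and the missing delimiters around transcribed text, can be illustrated with a short sketch. Everything here is an assumption made for illustration: the audit does not quote the skill's actual regex, so the 17 to 20 digit pattern is a common convention rather than the skill's code, and the marker strings and the `wrapTranscript` helper are hypothetical examples of what explicit boundary markers might look like.

```typescript
// Assumed pattern: Discord snowflakes are 64-bit integers sent as decimal
// strings; 17 to 20 digits is a commonly used bound, not the skill's regex.
const SNOWFLAKE_PATTERN = /^\d{17,20}$/;

export function isSnowflake(value: unknown): value is string {
  return typeof value === "string" && SNOWFLAKE_PATTERN.test(value);
}

// Hypothetical marker strings; the skill does not define these.
const OPEN_MARKER = "<untrusted_transcript>";
const CLOSE_MARKER = "</untrusted_transcript>";

// Hypothetical helper: wraps transcribed speech in explicit delimiters so
// the agent prompt can distinguish it from trusted instructions.
export function wrapTranscript(userId: string, transcript: string): string {
  if (!isSnowflake(userId)) {
    throw new Error("userId is not a valid Discord snowflake");
  }
  // Drop angle brackets so the transcript cannot contain either marker.
  const cleaned = transcript.replace(/[<>]/g, "").trim();
  return `${OPEN_MARKER}\nspeaker=${userId}\n${cleaned}\n${CLOSE_MARKER}`;
}
```

Removing angle brackets outright is a blunt choice, but it guarantees the markers cannot be reassembled from spoken input. Delimiters reduce ambiguity for the model; they do not by themselves prevent a model from following injected instructions.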
- Command Execution (LOW): The skill relies on system-level tools like `ffmpeg` for audio transcoding.
  - Evidence: `SKILL.md` and `clawdbot.plugin.json` declare `ffmpeg` as a system dependency, which is typically invoked via libraries like `prism-media` for audio stream processing.
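To show why this kind of `ffmpeg` use is usually low risk, here is a sketch of a direct invocation with a fixed argument array and no shell. It is not the skill's code: the audit says the skill goes through a library such as `prism-media`, whereas this sketch swaps in Node's `child_process` for illustration. The helper names and the chosen flags (raw 16-bit PCM output, 48 kHz stereo defaults) are assumptions.

```typescript
import { spawn } from "node:child_process";

// Hypothetical helper: builds a fixed ffmpeg argument list that reads from
// stdin and writes raw signed 16-bit little-endian PCM to stdout.
export function buildDecodeArgs(sampleRate: number, channels: number): string[] {
  if (!Number.isInteger(sampleRate) || sampleRate <= 0) {
    throw new RangeError("sampleRate must be a positive integer");
  }
  if (!Number.isInteger(channels) || channels <= 0) {
    throw new RangeError("channels must be a positive integer");
  }
  return [
    "-loglevel", "error",
    "-i", "pipe:0",
    "-f", "s16le",
    "-ar", String(sampleRate),
    "-ac", String(channels),
    "pipe:1",
  ];
}

// Hypothetical wrapper: spawn() without the `shell` option passes the
// arguments directly to the process, so they are never shell-interpreted.
export function spawnDecoder(sampleRate = 48000, channels = 2) {
  return spawn("ffmpeg", buildDecodeArgs(sampleRate, channels), {
    stdio: ["pipe", "pipe", "ignore"],
  });
}
```

Because only validated integers reach the argument list and no user-controlled string is interpolated into a command line, audio content from Discord cannot alter which command runs.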
Audit Metadata