The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: Fetches source code and updates from the vendor's official repository at github.com/qwibitai/nanoclaw-whatsapp.git to implement the transcription module.
[COMMAND_EXECUTION]: Utilizes standard build and test tools (npm, npx vitest) to integrate the new functionality and verify the WhatsApp channel handler.
[CREDENTIALS_UNSAFE]: Guides the user to configure an OpenAI API key within a local .env file for authenticating with the transcription service.
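One way to reduce the risk in the credentials finding is to read the key once, fail fast when it is missing, and never log it in full. The sketch below is illustrative only: the variable name `OPENAI_API_KEY` and the helpers `readOpenAiKey` and `redactKey` are assumptions, not names taken from the skill.

```typescript
// Hypothetical helpers; not part of the audited skill.
// The variable name OPENAI_API_KEY is an assumption.
type Env = Record<string, string | undefined>;

// Returns the configured key, or throws if it is missing or blank.
function readOpenAiKey(env: Env): string {
  const key = (env["OPENAI_API_KEY"] ?? "").trim();
  if (key === "") {
    throw new Error("OPENAI_API_KEY is not set; add it to the local .env file");
  }
  return key;
}

// Returns a shortened form of the key that is safe to write to logs.
function redactKey(key: string): string {
  return key.length <= 8 ? "****" : `${key.slice(0, 3)}...${key.slice(-4)}`;
}
```

In a Node.js process the environment would typically be passed in as `process.env` after the .env file has been loaded. The .env file itself should stay out of version control.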
[DATA_EXFILTRATION]: Transmits audio data to OpenAI's Whisper API; this is the primary intended behavior of the skill for converting voice messages to text.
[PROMPT_INJECTION]: The skill transcribes untrusted audio from WhatsApp users; a voice message can contain spoken instructions that reach the agent as text once transcribed.
Ingestion points: src/channels/whatsapp.ts via the transcribeAudioMessage call.
Boundary markers: Transcription results are wrapped in [Voice: <transcript>] delimiters to distinguish them from direct text messages.
Capability inventory: The skill enables the agent to read and respond to voice message content.
Sanitization: No specific text sanitization is applied to the transcript before it is delivered to the agent context.
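The boundary-marker and sanitization points above can be illustrated with a minimal sketch. It assumes a hypothetical helper, `wrapVoiceTranscript`, and an arbitrary length cap; neither comes from the skill. The helper neutralizes square brackets and control characters so a transcript cannot close the [Voice: ...] marker early or open a fake one.

```typescript
// Hypothetical helper; not part of the audited skill.
const MAX_TRANSCRIPT_CHARS = 2000; // assumed cap, chosen only for illustration

function wrapVoiceTranscript(transcript: string): string {
  const cleaned = transcript
    // Replace control characters (including newlines) with spaces.
    .replace(/[\u0000-\u001f\u007f]/g, " ")
    // Swap square brackets for parentheses so the marker cannot be spoofed.
    .replace(/[\[\]]/g, (ch) => (ch === "[" ? "(" : ")"))
    // Collapse runs of whitespace into a single space.
    .replace(/\s+/g, " ")
    .trim()
    // Bound the amount of untrusted text passed to the agent.
    .slice(0, MAX_TRANSCRIPT_CHARS);
  return `[Voice: ${cleaned}]`;
}
```

This only protects the delimiter. The words of the transcript can still read as instructions, so the agent should continue to treat anything inside the marker as untrusted data.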

add-voice-transcription