The Agent Skills Directory

[COMMAND_EXECUTION]: The script scripts/transcribe.js uses child_process.execSync to run ffmpeg for audio extraction from video files. The file path argument is interpolated into the shell command string using ${filePath} inside double quotes. This construct is vulnerable to command injection because shells typically evaluate command substitutions (like $(command)) and backticks even within double quotes. An attacker providing a malicious filename could execute arbitrary shell commands with the same privileges as the agent. \n- [DATA_EXFILTRATION]: The skill's core functionality involves reading local files (node:fs/promises.readFile) and sending their contents to an external API (Soniox). While expected for a speech-to-text tool, this pattern can be abused to exfiltrate sensitive files if the agent is tricked into transcribing non-media files such as .ssh/id_rsa or .env. \n- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection. \n
Ingestion points: Data from external audio/video files is processed and returned as text in scripts/transcribe.js. \n
Boundary markers: Absent. The transcription result is displayed or saved without delimiters or warnings to ignore embedded instructions. \n
Capability inventory: Subprocess execution (execSync in scripts/transcribe.js), file system access (readFile, writeFile, unlink), and network communication with the Soniox API. \n
Sanitization: Absent. The text received from the transcription API is used directly without filtering or escaping.

speech-to-text