speech-to-text
Pass
Audited by Gen Agent Trust Hub on Apr 16, 2026
Risk Level: SAFE
EXTERNAL_DOWNLOADS, COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: The skill references installation instructions and related skills hosted on the vendor's GitHub organization (inference-sh/skills). This is a legitimate distribution method for the vendor's tooling and extensions.
- [COMMAND_EXECUTION]: The skill instructs the agent to utilize the `infsh` CLI tool. Access to this tool is restricted in the YAML frontmatter through the `allowed-tools` field, which implements a security policy limiting execution to specific vendor commands.
- [DATA_EXFILTRATION]: Audio and video URLs provided by the user are transmitted to the inference.sh service for processing. This data transmission is the intended core functionality of the transcription service and is performed using the vendor's authenticated CLI.
- [PROMPT_INJECTION]: The skill processes untrusted external data in the form of audio and video URLs. This presents a potential surface for indirect prompt injection if the processed media contains malicious instructions.
  - Ingestion points: `audio_url` and `video_url` parameters in the `infsh` command examples within `SKILL.md`.
  - Boundary markers: Absent in the example command templates.
  - Capability inventory: The skill is restricted to the `infsh` tool via `allowed-tools` in `SKILL.md`.
  - Sanitization: Not explicitly defined in the instructions; the skill relies on the underlying tool's handling of input strings.
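
The two gaps noted above (no boundary markers, no explicit sanitization) could be narrowed along the following lines. This is a minimal sketch and not part of the audited skill: the function names `validate_media_url` and `wrap_untrusted`, the marker strings, and the HTTPS-only policy are all hypothetical choices made for illustration, and the sketch deliberately does not call `infsh`, since its exact invocation is not described here.

```python
from urllib.parse import urlparse

# Hypothetical policy: accept only HTTPS media URLs.
ALLOWED_SCHEMES = {"https"}

# Hypothetical boundary markers; any unambiguous pair would do.
BEGIN = "<<<UNTRUSTED_TRANSCRIPT_BEGIN>>>"
END = "<<<UNTRUSTED_TRANSCRIPT_END>>>"


def validate_media_url(url: str) -> str:
    """Return the URL unchanged if it passes basic checks, else raise ValueError."""
    # Reject whitespace, control characters, and characters commonly abused
    # when a string is interpolated into a command line.
    if any(ch.isspace() or ord(ch) < 32 or ch in "\"'`$;|<>\\" for ch in url):
        raise ValueError("URL contains whitespace or unsafe characters")
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.hostname:
        raise ValueError("URL must use https and include a hostname")
    return url


def wrap_untrusted(transcript: str) -> str:
    """Fence transcript text so downstream prompts can treat it as data only."""
    cleaned = transcript
    # Strip marker look-alikes repeatedly, so that removing one copy cannot
    # splice a new marker together from the surrounding text.
    while BEGIN in cleaned or END in cleaned:
        cleaned = cleaned.replace(BEGIN, "").replace(END, "")
    return f"{BEGIN}\n{cleaned}\n{END}"
```

A caller would validate the URL before handing it to the transcription tool and wrap the returned transcript before placing it in any prompt, with an accompanying instruction that text between the markers is never to be followed as a command.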
Audit Metadata