voice-to-report

Pass

Audited by Gen Agent Trust Hub on Mar 5, 2026

Risk Level: SAFEPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection through voice transcripts. If a user provides an audio recording containing spoken instructions to disregard formatting or leak information, the LLM may follow them.
  • Ingestion points: The voice_to_report and transcribe_multilingual functions in SKILL.md ingest untrusted audio data, which is converted to text and placed directly into the LLM context.
  • Boundary markers: The prompt uses a simple label (Transcript:\n{transcript.text}) without strong delimiters or specific instructions to ignore embedded commands within the transcript.
  • Capability inventory: The skill has filesystem (read access for audio files) and network (access to OpenAI API and potential external PDF services) permissions.
  • Sanitization: There is no evidence of transcript filtering, keyword checking, or input validation to detect or neutralize malicious spoken commands before processing.
  • [EXTERNAL_DOWNLOADS]: The skill requires several standard Python packages for operation.
  • Evidence: The skill documentation specifies the installation of openai, whisper (OpenAI's official speech-to-text library), and python-telegram-bot via pip.
  • Evidence: The code demonstrates the use of whisper.load_model("base"), which downloads pre-trained model weights from OpenAI's official repositories upon first execution.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 5, 2026, 04:28 AM