Hume AI API

Analyze emotions in text, audio, images, and video, generate expressive text-to-speech, and build speech-to-speech conversational agents via Hume's REST API.

Official docs: https://dev.hume.ai/reference

When to Use

Use this skill when you need to:

Analyze emotions and expressions in text, audio, images, or video (batch processing)
Generate expressive text-to-speech audio with emotional control
Manage EVI (Empathic Voice Interface) configurations, prompts, and tools
Retrieve chat transcripts and events from EVI conversations

Prerequisites

Sign up at https://app.hume.ai
Navigate to the API Keys page
Copy your API key

Set the environment variable:

export HUME_TOKEN="your-api-key"

Placeholders: Values in {curly-braces}, such as {job-id}, stand for real values. Replace them before running a command.
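
Every command in this document reads the key from HUME_TOKEN, so it can help to fail early when the variable is missing. A minimal sketch; the function name hume_require_token is hypothetical and not part of any Hume tooling:

```shell
# Hypothetical helper: succeeds if HUME_TOKEN is set and non-empty,
# otherwise prints a hint to stderr and returns 1.
hume_require_token() {
  if [ -z "${HUME_TOKEN:-}" ]; then
    echo "HUME_TOKEN is not set; export it before calling the API" >&2
    return 1
  fi
  return 0
}
```

Calling hume_require_token at the top of a script, followed by "|| exit 1", stops the script before any request is sent without a key.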

Expression Measurement (Batch)

Start Inference Job from URLs

Write to /tmp/hume_request.json:

{
  "urls": ["https://example.com/media-file.mp4"],
  "models": {
    "face": {},
    "prosody": {},
    "language": {}
  },
  "notify": true
}

Then run:

curl -s -X POST "https://api.hume.ai/v0/batch/jobs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

Start Inference Job with Text

Write to /tmp/hume_request.json:

{
  "text": ["I am so excited about this!", "This is really disappointing."],
  "models": {
    "language": {}
  }
}

Then run:

curl -s -X POST "https://api.hume.ai/v0/batch/jobs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .
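
The {job-id} used by the endpoints below comes from the response to a start request. The sketch below assumes that response is a JSON object with a top-level job_id field; verify this against an actual response before relying on it. The function name hume_job_id is hypothetical.

```shell
# Reads a start-job response on stdin and prints the job ID.
# Assumption: the ID is in a top-level "job_id" field.
# Prints nothing and returns non-zero when the field is absent (e.g. an error body).
hume_job_id() {
  jq -er '.job_id // empty'
}

# Usage sketch (not run here):
# JOB_ID=$(curl -s -X POST "https://api.hume.ai/v0/batch/jobs" \
#   --header "Content-Type: application/json" \
#   --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" \
#   -d @/tmp/hume_request.json | hume_job_id)
```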

List Jobs

curl -s "https://api.hume.ai/v0/batch/jobs?limit=10" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

List Jobs by Status

curl -s "https://api.hume.ai/v0/batch/jobs?limit=10&status=COMPLETED" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Get Job Details

curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
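
Because batch jobs run asynchronously, the details endpoint is usually polled until the job finishes. The following is a sketch under stated assumptions: the status is read from .state.status, COMPLETED and FAILED are treated as the terminal values, and the function names and the HUME_FETCH_JOB override are hypothetical conveniences.

```shell
# Prints the job status from a job-details response read on stdin.
# Assumption: the status lives at .state.status.
hume_job_status() {
  jq -r '.state.status // "UNKNOWN"'
}

# Default fetcher: the same request as the "Get Job Details" command.
hume_fetch_job() {
  curl -s "https://api.hume.ai/v0/batch/jobs/$1" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)"
}

# Polls until the job finishes.
# Arguments: job ID, delay in seconds (default 5), maximum attempts (default 60).
# HUME_FETCH_JOB may name another command that prints the job-details JSON.
# The body runs in a subshell (parentheses), so variables stay local and
# "exit" only leaves this function. Exit status: 0 completed, 1 failed, 2 timed out.
hume_wait_for_job() (
  job_id=$1
  delay=${2:-5}
  attempts=${3:-60}
  fetch=${HUME_FETCH_JOB:-hume_fetch_job}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$("$fetch" "$job_id" | hume_job_status)
    case "$status" in
      COMPLETED) echo "$status"; exit 0 ;;
      FAILED)    echo "$status"; exit 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "TIMEOUT"
  exit 2
)
```

For example, "hume_wait_for_job {job-id} 10 30" checks every 10 seconds for up to 30 attempts; the predictions request is only worth sending once it prints COMPLETED.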

Get Job Predictions

curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}/predictions" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Download Job Artifacts

curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}/artifacts" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" --output /tmp/hume_artifacts.zip

Text-to-Speech

Synthesize Speech (JSON Response)

Write to /tmp/hume_request.json:

{
  "utterances": [
    {
      "text": "Hello, how are you today?",
      "description": "A warm and friendly greeting"
    }
  ],
  "format": {
    "type": "mp3"
  }
}

Then run:

curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

The response contains base64-encoded audio in .generations[].audio.

Synthesize Speech with Specific Voice

Write to /tmp/hume_request.json:

{
  "utterances": [
    {
      "text": "Welcome to our platform!",
      "speed": 1.0
    }
  ],
  "voice": {
    "name": "{voice-name}"
  },
  "format": {
    "type": "mp3"
  }
}

Then run:

curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

Synthesize and Save Audio File

Write to /tmp/hume_request.json:

{
  "utterances": [
    {
      "text": "This is a test of text-to-speech synthesis."
    }
  ],
  "format": {
    "type": "mp3"
  }
}

Then extract and decode the audio:

curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq -r '.generations[0].audio' | base64 -d > /tmp/hume_output.mp3
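
The command above saves only the first generation. When a response holds more than one entry in generations, each can be decoded to its own file. A sketch that relies on the .generations[].audio shape described earlier; the function name and the file naming scheme are hypothetical.

```shell
# Reads a TTS response on stdin and writes each generation to <prefix>_<index>.mp3.
# Prints the path of every file written.
# The body runs in a subshell (parentheses) so its variables stay local.
hume_save_generations() (
  prefix=${1:-/tmp/hume_output}
  response=$(cat)
  count=$(printf '%s' "$response" | jq '.generations | length')
  count=${count:-0}   # treat unparsable input as "no generations"
  i=0
  while [ "$i" -lt "$count" ]; do
    out="${prefix}_${i}.mp3"
    printf '%s' "$response" | jq -r ".generations[$i].audio" | base64 -d > "$out"
    echo "$out"
    i=$((i + 1))
  done
)
```

Piping the curl output from the request above into "hume_save_generations /tmp/hume_output" would produce /tmp/hume_output_0.mp3, /tmp/hume_output_1.mp3, and so on.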

Synthesize Multiple Utterances

Write to /tmp/hume_request.json:

{
  "utterances": [
    {
      "text": "First sentence.",
      "description": "calm and measured"
    },
    {
      "text": "Second sentence!",
      "description": "excited and energetic",
      "trailing_silence": 0.5
    }
  ],
  "format": {
    "type": "mp3"
  }
}

Then run:

curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

EVI Configs

List Configs

curl -s "https://api.hume.ai/v0/evi/configs?page_size=20" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Create Config

Write to /tmp/hume_request.json:

{
  "name": "My EVI Config"
}

Then run:

curl -s -X POST "https://api.hume.ai/v0/evi/configs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

EVI Prompts

List Prompts

curl -s "https://api.hume.ai/v0/evi/prompts?page_size=20" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .

Create Prompt

Write to /tmp/hume_request.json:

{
  "name": "Customer Support Agent",
  "text": "You are a friendly and empathetic customer support agent. Listen carefully and help resolve issues."
}

Then run:

curl -s -X POST "https://api.hume.ai/v0/evi/prompts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" -d @/tmp/hume_request.json | jq .

EVI Chats

List Chat Events

curl -s "https://api.hume.ai/v0/evi/chats/{chat-id}?page_size=50&ascending_order=true" --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
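
To turn the event list into a readable transcript, the response can be filtered with jq. This sketch rests on assumptions about the response shape that should be checked against a real response: events are listed under events_page, and spoken turns carry role and message_text fields. The function name is hypothetical.

```shell
# Reads a chat-events response on stdin and prints one "ROLE: text" line per message.
# Assumptions: events are under .events_page; each message has "role" and
# "message_text". Events without text (e.g. tool calls) are skipped.
hume_chat_transcript() {
  jq -r '.events_page[]? | select(.message_text != null) | "\(.role): \(.message_text)"'
}
```

Appending "| hume_chat_transcript" to the request above in place of "| jq ." would print the conversation in order when ascending_order=true is set.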

Available Models for Expression Measurement

Model	Description
`face`	Facial expression analysis from images/video
`prosody`	Vocal expression analysis from audio
`language`	Emotion analysis from text
`ner`	Named entity recognition from text
`burst`	Non-speech vocal sounds (laughter, sighs)
`facemesh`	Detailed facial landmark detection

TTS Utterance Parameters

Parameter	Type	Description
`text`	string	Content to convert to speech (required)
`description`	string	Acting directions or voice style prompt
`speed`	number	Speech rate multiplier (default: 1.0)
`trailing_silence`	number	Silence after utterance in seconds (default: 0)
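
Hand-written JSON breaks easily when the text contains quotes or newlines. As an alternative to writing /tmp/hume_request.json by hand, the request body can be built with jq, which handles the escaping. A minimal sketch using only the parameters from the table above; the function name hume_tts_request is hypothetical.

```shell
# Prints a TTS request body for a single utterance.
# Arguments: text (required), description (optional), speed (optional, default 1.0).
# The description field is omitted entirely when no description is given.
hume_tts_request() {
  jq -n --arg text "$1" --arg description "${2:-}" --argjson speed "${3:-1.0}" '
    {
      utterances: [
        ({text: $text, speed: $speed}
         + (if $description == "" then {} else {description: $description} end))
      ],
      format: {type: "mp3"}
    }'
}
```

Redirecting the output, as in "hume_tts_request 'Hello there' 'warm and friendly' > /tmp/hume_request.json", yields a file that works with the curl commands in the Text-to-Speech section.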

Response Codes

Status	Description
`200`	Success
`201`	Resource created
`400`	Invalid request parameters
`401`	Missing or invalid API key
`404`	Resource not found
`429`	Rate limit exceeded
`5xx`	Server error
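
The commands in this document pipe the body straight into jq, which hides the status code. A script can capture the code separately and decide what to do next. In the sketch below, treating 429 and 5xx as retryable is an assumption based on common practice, not documented Hume behavior, and the function name is hypothetical.

```shell
# Maps an HTTP status code to a suggested action: ok, retry, or fail.
# Assumption: 429 and 5xx are worth retrying; other errors are not.
hume_status_action() {
  case "$1" in
    200|201) echo "ok" ;;
    429|5??) echo "retry" ;;
    *)       echo "fail" ;;
  esac
}

# Usage sketch (not run here): curl's -w '%{http_code}' prints the status code
# after the transfer, while -o writes the body to a file.
# code=$(curl -s -o /tmp/hume_response.json -w '%{http_code}' \
#   "https://api.hume.ai/v0/batch/jobs?limit=10" \
#   --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)")
# hume_status_action "$code"
```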

Guidelines

Authentication: Use the X-Hume-Api-Key header (not a Bearer token) for all requests
Batch Jobs: Jobs are asynchronous; poll the job details endpoint until the status is COMPLETED or FAILED
Models: Specify only the models you need in batch requests to reduce processing time
TTS Description: Use the description field to control emotional tone and style of generated speech
Pagination: EVI endpoints use zero-based page numbering with configurable page size (1-100)
API Key Types: Organization API Key for TTS and EVI; Personal API Key for Expression Measurement
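
The pagination guideline above can be sketched as a small URL builder for the EVI list endpoints. It assumes the page is selected with a page_number query parameter alongside the page_size parameter used earlier; the function name is hypothetical.

```shell
# Prints the list URL for one page of an EVI resource (e.g. configs or prompts).
# Arguments: resource, zero-based page number, page size (1-100).
# Assumption: the page is chosen with a "page_number" query parameter.
hume_evi_page_url() {
  if [ "$3" -lt 1 ] || [ "$3" -gt 100 ]; then
    echo "page size must be between 1 and 100" >&2
    return 1
  fi
  echo "https://api.hume.ai/v0/evi/$1?page_number=$2&page_size=$3"
}

# Usage sketch (not run here): fetch the second page of prompts.
# curl -s "$(hume_evi_page_url prompts 1 20)" \
#   --header "X-Hume-Api-Key: $(printenv HUME_TOKEN)" | jq .
```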

API Reference

Documentation: https://dev.hume.ai/reference
Portal: https://app.hume.ai
API Keys: https://app.hume.ai/keys