Phone Agent Skill
Runs a local FastAPI server that acts as a real-time voice bridge.
Architecture
Twilio (Phone) <--> WebSocket (Audio) <--> [Local Server] <--> Deepgram (STT)
                                                  |
                                                  +--> OpenAI (LLM)
                                                  +--> OpenAI TTS or ElevenLabs (TTS)
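The flow in the diagram can be sketched as one caller turn passing through three stages. This is a minimal illustration, not the code in scripts/server.py: the function names `decode_media_frame` and `run_turn` are hypothetical, the stages are passed in as plain callables, and the message shape assumes Twilio Media Streams (JSON events with a base64 audio payload). A real bridge would stream audio continuously rather than process it turn by turn.

```python
import base64
import json
from typing import Callable


def decode_media_frame(message: str) -> bytes:
    """Return the raw audio bytes from a Twilio "media" event.

    Other event types (e.g. "start", "stop") carry no audio, so they
    yield an empty byte string.
    """
    event = json.loads(message)
    if event.get("event") != "media":
        return b""
    return base64.b64decode(event["media"]["payload"])


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],   # STT stage (Deepgram in the diagram)
    respond: Callable[[str], str],        # LLM stage (OpenAI in the diagram)
    synthesize: Callable[[str], bytes],   # TTS stage (OpenAI TTS or ElevenLabs)
) -> bytes:
    """Run one caller turn: audio in, transcript, reply text, reply audio out."""
    transcript = transcribe(audio)
    if not transcript.strip():
        return b""  # nothing was recognized, so there is nothing to answer
    return synthesize(respond(transcript))
```

Passing the stages as callables keeps the sketch independent of any particular SDK and makes it easy to swap the TTS provider.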
Prerequisites
- Twilio Account: Phone number + TwiML App.
- Deepgram API Key: For fast speech-to-text.
- OpenAI API Key: For conversation logic + TTS (default).
- ElevenLabs API Key (optional): For higher-quality TTS (set TTS_PROVIDER=elevenlabs).
- Ngrok (or similar): To expose your local port 8080 to Twilio.
Setup
- Install Dependencies:
    pip install -r scripts/requirements.txt
- Set Environment Variables (in ~/.moltbot/.env, ~/.clawdbot/.env, or export):
    export DEEPGRAM_API_KEY="your_key"
    export OPENAI_API_KEY="your_key"
    export TWILIO_ACCOUNT_SID="your_sid"
    export TWILIO_AUTH_TOKEN="your_token"
    export PORT=8080

    # TTS Provider (default: openai, ~6x cheaper than ElevenLabs)
    export TTS_PROVIDER="openai"       # or "elevenlabs"
    export OPENAI_TTS_VOICE="echo"     # alloy, echo, fable, onyx, nova, shimmer
    export OPENAI_TTS_MODEL="tts-1"    # tts-1 (fast) or tts-1-hd (quality)

    # Only needed if TTS_PROVIDER=elevenlabs
    export ELEVENLABS_API_KEY="your_key"
    export ELEVENLABS_VOICE_ID="onwK4e9ZLuTAKqWW03F9"
  Optional - System Prompt Customization (priority: file > env var > built-in):
    # Option 1: Load from file
    export SYSTEM_PROMPT_FILE="/path/to/custom-prompt.txt"

    # Option 2: Set directly via env var
    export SYSTEM_PROMPT="You are a helpful phone assistant. Be concise and friendly."

    # Option 3: Use built-in defaults with name customization
    export AGENT_NAME="Niemand"
    export OWNER_NAME="Martin's"
- Start the Server:
    python3 scripts/server.py
- Expose to Internet:
    ngrok http 8080
- Configure Twilio:
  - Go to your Phone Number settings.
  - Set "Voice & Fax" -> "A Call Comes In" to Webhook.
  - URL: https://<your-ngrok-url>.ngrok.io/incoming
  - Method: POST
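The system prompt priority described in the setup steps (file > env var > built-in) could be resolved roughly as follows. This is a sketch under assumptions: the function name `resolve_system_prompt`, the `DEFAULT_TEMPLATE` wording, the fallback names, and the choice to skip a missing file are all hypothetical, and the actual logic in scripts/server.py may differ.

```python
import os
from pathlib import Path
from typing import Mapping

# Hypothetical built-in default; the real wording lives in the server script.
DEFAULT_TEMPLATE = "You are {agent}, {owner} phone assistant. Be concise and friendly."


def resolve_system_prompt(env: Mapping[str, str] = os.environ) -> str:
    """Pick the system prompt using the priority: file > env var > built-in."""
    # 1. A prompt file wins if it is set and exists (assumption: a missing
    #    file falls through to the next option instead of raising).
    path = env.get("SYSTEM_PROMPT_FILE")
    if path and Path(path).is_file():
        return Path(path).read_text(encoding="utf-8").strip()

    # 2. Otherwise use the prompt given directly in the environment.
    prompt = env.get("SYSTEM_PROMPT")
    if prompt:
        return prompt

    # 3. Otherwise fill the built-in template with the configured names.
    return DEFAULT_TEMPLATE.format(
        agent=env.get("AGENT_NAME", "Assistant"),
        owner=env.get("OWNER_NAME", "the owner's"),
    )
```

Taking the environment as a parameter makes the priority order easy to check without touching the real process environment.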
Usage
Call your Twilio number. The agent should answer, transcribe your speech, think, and reply in a natural voice.
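When a call comes in, Twilio POSTs to the /incoming webhook and expects TwiML in response. One plausible response connects the call to a WebSocket media stream using TwiML's `<Connect><Stream>` verb. The sketch below builds that XML with the standard library only; the function name `incoming_twiml` and the `/ws` path are assumptions, and the server in this skill may return something different.

```python
import xml.etree.ElementTree as ET


def incoming_twiml(public_host: str, ws_path: str = "/ws") -> str:
    """Build TwiML that streams call audio to a WebSocket on the given host.

    public_host is the externally reachable hostname (for example the one
    ngrok prints); ws_path is a hypothetical WebSocket route on the server.
    """
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    # Twilio requires a secure WebSocket URL (wss://) for media streams.
    ET.SubElement(connect, "Stream", url=f"wss://{public_host}{ws_path}")
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(incoming_twiml("<your-ngrok-url>.ngrok.io"))
```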
Customization
- System Prompt: Configure via SYSTEM_PROMPT_FILE (load from file), SYSTEM_PROMPT (env var), or modify the built-in defaults with AGENT_NAME and OWNER_NAME.
- TTS Provider: Set TTS_PROVIDER=openai (default, $0.03/min) or TTS_PROVIDER=elevenlabs ($0.17/min, higher quality).
- Voice (OpenAI): Set OPENAI_TTS_VOICE; options: alloy, echo, fable, onyx, nova, shimmer.
- Voice (ElevenLabs): Change ELEVENLABS_VOICE_ID to use different voices.
- Model: Switch gpt-4o-mini to gpt-4 for smarter (but slower) responses.
- Language: Set AGENT_LANGUAGE to en or de for English or German.
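The TTS options above can be validated up front, and the per-minute prices turned into a rough cost estimate. This is an illustrative sketch: the helper names `tts_settings` and `estimated_tts_cost` are hypothetical, the "echo" default voice is assumed from the setup example, and the prices are the per-minute figures quoted in this document, not live pricing.

```python
import os
from typing import Mapping

# Per-minute TTS prices as quoted in this document (USD).
COST_PER_MINUTE = {"openai": 0.03, "elevenlabs": 0.17}
OPENAI_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}


def tts_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read and validate the TTS provider and voice from the environment."""
    provider = env.get("TTS_PROVIDER", "openai").lower()
    if provider not in COST_PER_MINUTE:
        raise ValueError(f"Unknown TTS_PROVIDER: {provider!r}")

    if provider == "openai":
        # Assumption: "echo" is the default voice, as in the setup example.
        voice = env.get("OPENAI_TTS_VOICE", "echo")
        if voice not in OPENAI_VOICES:
            raise ValueError(f"Unknown OPENAI_TTS_VOICE: {voice!r}")
    else:
        voice = env.get("ELEVENLABS_VOICE_ID", "")
        if not voice:
            raise ValueError("ELEVENLABS_VOICE_ID is required for elevenlabs")

    return {"provider": provider, "voice": voice}


def estimated_tts_cost(provider: str, minutes: float) -> float:
    """Estimate the TTS cost in USD for the given minutes of speech."""
    return round(COST_PER_MINUTE[provider] * minutes, 2)
```

For example, `estimated_tts_cost("openai", 10)` gives 0.3 and `estimated_tts_cost("elevenlabs", 10)` gives 1.7, which matches the roughly 6x price gap between the two providers.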
Repository
kesslerio/phone…ot-skill