Cledon — Voice AI Agent Testing
Cledon tests voice AI agents by simulating callers that phone your agent and evaluate responses against assertions.
Domain Model
- Agent: the voice AI being tested (name, phone number, personality)
- Folder: groups related test cases
- Test Case: defines assertions and expected tool calls for one agent
- Scenario: a runnable test with caller instructions for one test case
- Run: one execution of a scenario, producing a transcript and pass/fail results
Relationships: Agent → many Test Cases → many Scenarios. Each Scenario produces Runs.
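The relationships above can be sketched as plain Python dataclasses. This is only an illustration of the hierarchy: the class and field names (`CledonTestCase`, `run_id`, `caller_instructions`, and so on) and the `pass_rate` helper are hypothetical and do not come from Cledon's API, and Folder is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Run:
    run_id: str
    status: str                    # "running", "completed", or "failed"
    outcome: Optional[str] = None  # "passed" or "failed" once the run finishes


@dataclass
class Scenario:
    scenario_id: str
    caller_instructions: str
    runs: List[Run] = field(default_factory=list)


@dataclass
class CledonTestCase:
    test_case_id: str
    assertions: List[str]
    expected_tool_calls: List[str] = field(default_factory=list)
    scenarios: List[Scenario] = field(default_factory=list)


@dataclass
class Agent:
    agent_id: str
    name: str
    test_cases: List[CledonTestCase] = field(default_factory=list)


def pass_rate(agent: Agent) -> Optional[float]:
    """Fraction of finished runs that passed, or None if no run has an outcome."""
    outcomes = [
        run.outcome
        for case in agent.test_cases
        for scenario in case.scenarios
        for run in scenario.runs
        if run.outcome is not None
    ]
    if not outcomes:
        return None
    return sum(1 for o in outcomes if o == "passed") / len(outcomes)
```

Walking Agent → Test Cases → Scenarios → Runs in one comprehension mirrors the one-to-many chain described above.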
Available Tools (24)
Analytics
| Tool | Purpose |
|---|---|
| `get-overall-stats` | Dashboard summary: total scenarios, runs, pass rate, avg duration |
| `get-run-history` | Recent runs with pass/fail counts (1-90 days lookback) |
| `get-failed-assertions` | Top 10 recurring failures with up to 3 example runs each |
Agents
| Tool | Purpose |
|---|---|
| `list-agents` | List all voice agents |
| `get-agent` | Full agent details by ID |
| `create-agent` | Create agent in call mode (phone number) or LLM mode (ElevenLabs, Vapi, LiveKit, Famulor, Synthflow) |
| `update-agent` | Update agent properties |
| `delete-agent` | Delete agent and associated data |
Test Cases & Scenarios
| Tool | Purpose |
|---|---|
| `list-testcases` | List test cases (optional `folderId` filter) |
| `get-testcase` | Full test case with assertions and expected tool calls |
| `create-testcase` | AI-generate test case from a transcript or system prompt; supports `includeScenarios` to auto-create scenarios |
| `update-testcase` | Update test case properties |
| `execute-testcase` | Run all scenarios for a test case |
| `list-scenarios` | List scenarios (optional `testCaseId` filter) |
| `get-scenario` | Full scenario with caller instructions |
Execution
| Tool | Purpose |
|---|---|
| `run-scenario` | Trigger single test → returns runId |
| `run-multiple-scenarios` | Batch trigger → returns array of runIds |
| `get-run-status` | Full run details: transcript, assertions, tool call validation |
| `get-scenario-runs` | Run history for one scenario with pass/fail counts |
| `cancel-run` | Cancel a stuck run (only status=running) |
Credentials
| Tool | Purpose |
|---|---|
| `list-credentials` | List all stored voice platform credentials (keys never exposed) |
| `create-credential` | Store a new platform API key (elevenlabs, vapi, livekit, famulor, synthflow) |
| `update-credential` | Update a credential's name or API key |
| `delete-credential` | Delete a stored credential |
Workflows
Get an overview of testing status
1. `get-overall-stats` → see pass rate, total runs, average duration
2. `get-run-history` with days=7 → see recent individual results
3. `get-failed-assertions` → identify systemic issues
Run a test and check results
1. `list-scenarios` → find the scenario ID
2. `run-scenario` with scenarioId → get back a runId
3. Wait a moment, then `get-run-status` with runId → see transcript + assertion results
4. If status is still "running", wait and check again
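The trigger-then-poll steps above can be sketched as a small helper. This is a minimal sketch under stated assumptions: `call_tool` is a hypothetical callable standing in for whatever client invokes the Cledon tools, and the `runId` and `status` response fields are assumed from the descriptions in this document rather than from a confirmed schema.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> decoded result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_and_wait(
    call_tool: ToolCaller,
    scenario_id: str,
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Trigger one scenario, then poll until its status is no longer "running"."""
    started = call_tool("run-scenario", {"scenarioId": scenario_id})
    run_id = started["runId"]  # assumed response field

    deadline = time.monotonic() + timeout
    while True:
        run = call_tool("get-run-status", {"runId": run_id})
        if run.get("status") != "running":  # assumed response field
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still running after {timeout} seconds")
        sleep(poll_interval)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays. The timeout is a design choice of this sketch: a run that never leaves "running" is a candidate for `cancel-run`.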
Run all tests for a test case
1. `list-scenarios` with testCaseId filter → collect all scenario IDs
2. `run-multiple-scenarios` with the ID array
3. `get-run-history` with days=1 → see batch results
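The first two steps of the batch workflow above might look like the sketch below. It assumes a hypothetical `call_tool` callable for invoking tools, that `list-scenarios` returns a list of objects with an `id` field, and that `run-multiple-scenarios` takes its IDs under a `scenarioIds` argument. None of these names are confirmed by this document, so check them against the actual tool schemas.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: call_tool(tool_name, arguments) -> decoded result.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def run_all_for_testcase(call_tool: ToolCaller, test_case_id: str) -> List[str]:
    """Collect every scenario ID for one test case and trigger them as a batch."""
    scenarios = call_tool("list-scenarios", {"testCaseId": test_case_id})
    scenario_ids = [s["id"] for s in scenarios]  # "id" is an assumed field name
    if not scenario_ids:
        return []  # nothing to trigger
    # "scenarioIds" is an assumed argument name; the result is taken to be
    # the array of runIds described in the Execution table.
    return list(call_tool("run-multiple-scenarios", {"scenarioIds": scenario_ids}))
```

The returned run IDs can then be checked one at a time with `get-run-status`, or in aggregate through `get-run-history`.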
Investigate failures
1. `get-failed-assertions` → find the most common failures
2. Pick a failure, note the example runIds
3. `get-run-status` for each runId → read the transcript to understand what went wrong
4. `get-scenario-runs` for that scenarioId → check if it's a regression or consistent failure
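The last step, telling a regression apart from a consistent failure, reduces to a simple rule over a scenario's outcome history. The helper below is a hypothetical illustration, not part of Cledon: it assumes you have already extracted the outcomes from `get-scenario-runs` into a list ordered oldest first, and its labels are this sketch's own.

```python
from typing import List


def classify_history(outcomes: List[str]) -> str:
    """Classify a scenario from its run outcomes, ordered oldest first.

    Each entry is expected to be "passed" or "failed".
    """
    if not outcomes:
        return "no runs"
    if outcomes[-1] == "passed":
        return "currently passing"
    if all(outcome == "failed" for outcome in outcomes):
        return "consistent failure"  # has never passed in this window
    return "regression"  # passed at some point, but the latest run failed


print(classify_history(["passed", "passed", "failed"]))
print(classify_history(["failed", "failed", "failed"]))
```

The rule is deliberately coarse: a scenario that alternates between passing and failing is also labeled a regression here, so an intermittent failure still needs a look at the individual transcripts.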
Drill into a specific test case
1. `get-testcase` with id → see assertions and expected tool calls
2. `list-scenarios` with testCaseId → see all persona combinations
3. `get-scenario` for each → see caller instructions
Create a new test from scratch
1. `list-agents` → pick the agent to test (or `create-agent`)
2. `create-testcase` with agent ID and assertions; set `includeScenarios: true` to auto-generate scenarios
3. `execute-testcase` → run all scenarios, or `run-scenario` → run a single one
Create tests from a transcript
1. `list-agents` → pick the agent (or `create-agent`)
2. `create-testcase` with `agentId` and `transcript`: AI analyzes the transcript and generates assertions, icons, and expected tool calls
3. Optionally set `includeScenarios: true` to also generate caller scenarios
4. `execute-testcase` → run all generated scenarios
Key Patterns
- List endpoints return compact data. Use the corresponding get-by-ID tool to see full details.
- `run-scenario` is async: it returns a runId immediately. Poll `get-run-status` to see results.
- All data is scoped to the authenticated user's organization. No cross-tenant access.
- Run `outcome` is either "passed" or "failed". Run `status` progresses: running → completed/failed.
Repository: triceralab/cledon-skill
First seen: Feb 13, 2026