Speak With Profile

Use this skill as the required front door for speech tasks. Treat built-in $speech as the synthesis engine, and keep this skill as the policy and profile adapter.

Purpose

Normalize speech requests with deterministic profile precedence.
Enforce disclosure and manifest policy consistently.
Delegate generation to $speech by default in Codex App/CLI conversations.
Keep a script fallback for deterministic local/automation execution.

Required UX pattern

Route speech requests through $speak-with-profile first.
Resolve profile/default/override values.
Require disclosure text in the user-visible output.
Delegate generation to built-in $speech using resolved fields.
Record effective configuration in a run manifest.

Execution modes

delegate (default for conversational usage): use built-in $speech for synthesis.
local-cli (fallback for scripted/automation runs): invoke scripts/text_to_speech.py via scripts/speak_with_profile.py.

Use delegate unless there is a concrete reason to run the local script path.

Inputs needed

Input source: either the exact text to speak or a path to a text file.
Profile choice: a --profile ID if the caller wants a named profile, or no profile if baseline defaults are acceptable.
Profile data source: a --profiles-file path when profile-based resolution is required.
Delivery guidance: any explicit voice, instructions, speed, or format overrides the caller wants to force.
Output intent: target output path or filename expectations for the generated audio.
Execution preference: whether built-in $speech is acceptable or deterministic local-cli behavior is required.
Playback intent: whether the generated file should also be played locally with open or afplay.
Optional skill-local workflow customization from config/customization.yaml or config/customization.template.yaml.

Hard constraints

Do not bypass this skill for speech tasks in this repository workflow.
Do not alter built-in $speech or the assumptions about its behavior; adapt inputs and outputs around it.
Never modify the bundled generation script scripts/text_to_speech.py.
Use built-in voices only.
Require OPENAI_API_KEY for live API calls.

Disclosure policy

Always include a clear disclosure in user-visible output when speech is produced, for example:

"This audio was generated by AI text-to-speech."

If a profile provides disclosure, use it. Otherwise, use the default disclosure above.
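
The fallback rule above can be sketched in a few lines. This is a minimal illustration, not the wrapper's actual code: the helper name resolve_disclosure and the profile field name "disclosure" are assumptions, and references/profile-schema.md remains the authority on the real field.

```python
from typing import Any, Optional

# Default disclosure text quoted from the policy above.
DEFAULT_DISCLOSURE = "This audio was generated by AI text-to-speech."


def resolve_disclosure(profile: Optional[dict[str, Any]]) -> str:
    """Return the profile's disclosure if present, else the default.

    Hypothetical helper; the "disclosure" key is an assumed field name.
    """
    text = (profile or {}).get("disclosure")
    if isinstance(text, str) and text.strip():
        return text.strip()
    return DEFAULT_DISCLOSURE
```

A blank or missing value falls back to the default, so the user-visible output always carries some disclosure.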

Profile resolution order

Explicit wrapper flags (--voice, --speed, --instructions, --format, output path flags).
Selected profile (--profile).
Default profile from profiles file (default_profile).
Baseline defaults (voice=cedar, speed=1.0, format=mp3, model=gpt-4o-mini-tts-2025-12-15).
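
As a rough sketch, the order above can be read as a layered merge in which later layers override earlier ones. Several things here are assumptions rather than the wrapper's documented behavior: the function name resolve_settings, the "profiles" key holding a mapping of profile IDs to settings, and the choice to merge field by field so that a selected profile missing a field inherits it from the default profile. Check references/wrapper-contract.md for the real contract.

```python
from typing import Any, Optional

# Baseline defaults as listed in the resolution order above.
BASELINE: dict[str, Any] = {
    "voice": "cedar",
    "speed": 1.0,
    "format": "mp3",
    "model": "gpt-4o-mini-tts-2025-12-15",
}


def resolve_settings(
    overrides: dict[str, Any],
    profiles_doc: Optional[dict[str, Any]] = None,
    profile_id: Optional[str] = None,
) -> dict[str, Any]:
    """Merge settings from lowest to highest precedence (hypothetical helper)."""
    if profile_id is not None and profiles_doc is None:
        raise ValueError("--profile requires a profiles file")

    resolved = dict(BASELINE)
    if profiles_doc is not None:
        # Assumed layout: {"default_profile": "<id>", "profiles": {"<id>": {...}}}
        profiles = profiles_doc.get("profiles", {})
        # The default profile is applied first so the selected profile overrides it.
        for pid in (profiles_doc.get("default_profile"), profile_id):
            if pid is None:
                continue
            if pid not in profiles:
                known = ", ".join(sorted(profiles)) or "(none)"
                raise ValueError(f"unknown profile {pid!r}; known profiles: {known}")
            resolved.update(profiles[pid])

    # Explicit flags win; None means the flag was not passed.
    resolved.update({k: v for k, v in overrides.items() if v is not None})
    return resolved
```

Calling resolve_settings({}) with no profiles file returns the baseline defaults unchanged, which matches the "no profile" case described under Inputs needed.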

Workflow configuration precedence

Explicit user input and wrapper flags.
Skill-local workflow customization in config/customization.yaml, when present.
Skill-local defaults in config/customization.template.yaml.
Workflow defaults described in this skill and references/wrapper-contract.md.

The local wrapper loads supported customization values from config/customization.yaml and config/customization.template.yaml at runtime. Use skill-local customization to guide wrapper defaults and agent decisions. preferredExecutionMode remains documentation-only because the wrapper script implements only local-cli.

Workflow

Resolve input text source (--text or --text-file).
Resolve effective workflow settings using explicit input, skill-local customization, then workflow defaults.
Resolve profile and defaults using references/wrapper-contract.md.
Validate configuration against references/profile-schema.md.
Choose execution mode:
- delegate: call built-in $speech with resolved fields.
- local-cli: run scripts/speak_with_profile.py.
Emit and validate the run manifest, and include the disclosure.

Output contract

delegate: return a user-visible result that confirms speech generation intent, includes the disclosure text, and surfaces the resolved speech settings that matter for the request.
local-cli: produce an audio file at the requested or resolved output path and write an adjacent <audio>.manifest.json file.
local-cli manifest/result reporting must include:
- output audio path
- manifest path
- disclosure text
- execution mode (local-cli)
- playback result when open or afplay is used
Playback failures must remain explicit rather than being treated as silent success.
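
To make the local-cli reporting requirements concrete, here is a minimal sketch of writing an adjacent manifest. It is illustrative only: the function name write_manifest and every JSON key name are assumptions, and so is the reading of <audio>.manifest.json as the full audio filename plus a suffix (out.mp3 becomes out.mp3.manifest.json). The real wrapper may differ on all of these.

```python
import json
from pathlib import Path
from typing import Any, Optional


def write_manifest(
    audio_path: str,
    disclosure: str,
    playback: Optional[dict[str, Any]] = None,
) -> Path:
    """Write a manifest next to the audio file (hypothetical helper and key names)."""
    audio = Path(audio_path)
    # Assumed naming: append ".manifest.json" to the full audio filename.
    manifest_path = audio.with_name(audio.name + ".manifest.json")
    manifest: dict[str, Any] = {
        "output_audio_path": str(audio),
        "manifest_path": str(manifest_path),
        "disclosure": disclosure,
        "execution_mode": "local-cli",
    }
    # Playback is recorded only when open or afplay was actually attempted,
    # and a failed attempt is stored as-is rather than dropped.
    if playback is not None:
        manifest["playback"] = playback
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return manifest_path
```

Recording the playback outcome as data, including failures, keeps a failed afplay or open attempt visible to whoever reads the manifest later.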

Failure modes

Missing OPENAI_API_KEY: stop before generation and tell the caller that the local environment must provide the API key.
Missing text source or unreadable text file: fail fast rather than guessing input content.
Missing profiles file when --profile is requested: fail fast and require the caller to provide a valid JSON or YAML profiles file.
Invalid profiles file: stop on schema/parse errors and surface the validation failure clearly.
Unknown profile ID: fail with the known profile list from the supplied profiles file.
Input over 4096 characters: fail fast and require the caller to split or chunk the input before running.
Missing bundled scripts/text_to_speech.py in local-cli mode: stop and report that the expected bundled CLI is unavailable.
Downstream speech CLI failure: propagate the subprocess failure rather than masking it.
Playback failure with open or afplay: return the deterministic playback failure outcome, record it in the manifest, and respect --stop-on-error / --no-stop-on-error.
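
Several of the fail-fast rules above are simple pre-flight checks that can run before any generation call. The sketch below covers three of them (missing API key, missing or unreadable text source, input over 4096 characters). The names SpeechInputError, load_text, and require_api_key are hypothetical, and letting an explicit text argument take priority over a text file is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

MAX_INPUT_CHARS = 4096  # limit stated in the failure modes above


class SpeechInputError(Exception):
    """Raised when a pre-flight check fails (hypothetical error type)."""


def require_api_key(env: Mapping[str, str] = os.environ) -> None:
    """Stop before generation if OPENAI_API_KEY is absent or empty."""
    if not env.get("OPENAI_API_KEY"):
        raise SpeechInputError("OPENAI_API_KEY must be set in the local environment")


def load_text(text: Optional[str], text_file: Optional[str]) -> str:
    """Return the text to speak, failing fast instead of guessing content."""
    if text is None and text_file is None:
        raise SpeechInputError("missing text source: pass --text or --text-file")
    if text is None:
        try:
            text = Path(text_file).read_text(encoding="utf-8")
        except OSError as exc:
            raise SpeechInputError(f"unreadable text file: {text_file}") from exc
    if len(text) > MAX_INPUT_CHARS:
        raise SpeechInputError(
            f"input is {len(text)} characters; split it into chunks of "
            f"at most {MAX_INPUT_CHARS}"
        )
    return text
```

Raising a single error type keeps the caller's handling simple: any SpeechInputError means nothing was generated and the message says what to fix.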

References

Profile schema and validation rules: references/profile-schema.md
Starter profile set and examples: references/starter-profiles.md
Adapter contract and mode behavior: references/wrapper-contract.md
Skill-local workflow customization: references/customization.md

Validation helper

Use scripts/validate_manifest.py to verify required manifest keys:

uv run python speak-with-profile/scripts/validate_manifest.py path/to/output/file.manifest.json
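
For readers without the script at hand, a required-key check of this kind might look roughly like the following. This is not the contents of scripts/validate_manifest.py: the function name and the key names in REQUIRED_KEYS are assumptions chosen to mirror the reporting fields listed under the output contract.

```python
import json
from pathlib import Path

# Assumed key names; the real script defines the authoritative list.
REQUIRED_KEYS = ("output_audio_path", "manifest_path", "disclosure", "execution_mode")


def missing_manifest_keys(manifest_path: str) -> list[str]:
    """Return the required keys absent from a manifest file (hypothetical helper)."""
    data = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("manifest must be a JSON object")
    return [key for key in REQUIRED_KEYS if key not in data]
```

An empty list means every assumed key is present; a non-empty list names exactly what is missing.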