maui-speech-to-text
Speech-to-Text — Gotchas & Best Practices
For full service implementation, types, and UI integration patterns, see references/speech-to-text-api.md.
Critical: Permission Handling
Always request permissions before starting speech recognition. Both microphone and speech permissions are required.
// ❌ Starting recognition without checking permissions
var result = await _speechService.StartListeningAsync();
// ✅ Always check permissions first
if (!await _speechService.RequestPermissionsAsync())
    return; // Gracefully handle denial
var result = await _speechService.StartListeningAsync(cancellationToken);
⚠️ iOS Requires Both Permission Descriptions
Missing either NSSpeechRecognitionUsageDescription or NSMicrophoneUsageDescription in Info.plist will cause a runtime crash — not a graceful failure.
⚠️ Android: Permission Re-prompt
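On Android 11 (API level 30) and later, if the user denies the RECORD_AUDIO permission twice, the OS treats that as "don't ask again" and stops showing the prompt. From then on the permission can only be granted from the app's system settings page, so you must guide users there yourself.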
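The Settings redirect described above can be sketched as a small decision helper. This is a minimal illustration, not part of MAUI or CommunityToolkit.Maui: `MicPermissionFlow`, `MicPermissionOutcome`, and the three delegate parameters are hypothetical names. In a MAUI app the delegates would plausibly wrap `Permissions.RequestAsync<Permissions.Microphone>()`, `Permissions.ShouldShowRationale<Permissions.Microphone>()` (checked after a denial), and `AppInfo.Current.ShowSettingsUI()`.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical outcome type for illustration only.
public enum MicPermissionOutcome { Granted, DeniedCanRetry, SentToSettings }

public static class MicPermissionFlow
{
    // requestAsync:   returns true when the OS grants the microphone permission.
    // canPromptAgain: returns true while the OS is still willing to show its dialog.
    // openSettings:   opens the app's page in the system settings.
    public static async Task<MicPermissionOutcome> EnsureAsync(
        Func<Task<bool>> requestAsync,
        Func<bool> canPromptAgain,
        Action openSettings)
    {
        if (await requestAsync())
            return MicPermissionOutcome.Granted;

        // Denied, but the OS will still prompt on the next attempt.
        if (canPromptAgain())
            return MicPermissionOutcome.DeniedCanRetry;

        // The OS has stopped prompting; Settings is the only way forward.
        openSettings();
        return MicPermissionOutcome.SentToSettings;
    }
}
```

In a real app it is usually friendlier to explain why the microphone is needed and ask for confirmation before calling `openSettings`, rather than jumping to Settings immediately.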
Common Mistakes
❌ No Timeout — Indefinite Listening
Always set a timeout to prevent indefinite listening sessions that drain battery:
// ❌ No timeout — listens forever if no speech detected
await _speechToText.StartListenAsync(options, CancellationToken.None);
// ✅ Use a combined timeout + user cancellation token
using var timeoutCts = new CancellationTokenSource(TimeSpan.FromSeconds(60));
using var combinedCts = CancellationTokenSource.CreateLinkedTokenSource(
    userCancellationToken, timeoutCts.Token);
await _speechToText.StartListenAsync(options, combinedCts.Token);
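With a linked token, the caller often needs to know whether a session ended because of the timeout or because the user cancelled. The sketch below shows one way to tell them apart. `ListenTimeout`, `ListenEnd`, and the `listenAsync` delegate are hypothetical names, and the code assumes the listening call surfaces cancellation as an `OperationCanceledException`, which is worth verifying for the speech API you actually call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical result type for illustration only.
public enum ListenEnd { Completed, TimedOut, CancelledByUser }

public static class ListenTimeout
{
    public static async Task<ListenEnd> RunAsync(
        Func<CancellationToken, Task> listenAsync,
        TimeSpan timeout,
        CancellationToken userToken)
    {
        using var timeoutCts = new CancellationTokenSource(timeout);
        using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(
            userToken, timeoutCts.Token);
        try
        {
            await listenAsync(linkedCts.Token);
            return ListenEnd.Completed;
        }
        catch (OperationCanceledException) when (linkedCts.IsCancellationRequested)
        {
            // If both tokens fired, report the user's cancellation.
            return userToken.IsCancellationRequested
                ? ListenEnd.CancelledByUser
                : ListenEnd.TimedOut;
        }
    }
}
```

A call might look like `await ListenTimeout.RunAsync(ct => _speechToText.StartListenAsync(options, ct), TimeSpan.FromSeconds(60), userCancellationToken)`, letting the UI show a "no speech detected" message only for `TimedOut`.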
❌ Not Unsubscribing Event Handlers
Leaking event subscriptions causes duplicate processing and memory leaks:
// ❌ Subscribe without unsubscribe
_speechToText.RecognitionResultUpdated += OnRecognitionResultUpdated;
// ✅ Always unsubscribe in a finally block
try
{
    _speechToText.RecognitionResultUpdated += OnRecognitionResultUpdated;
    // ... listen ...
}
finally
{
    _speechToText.RecognitionResultUpdated -= OnRecognitionResultUpdated;
}
❌ Not Disposing CancellationTokenSource
// ❌ Leaked CTS
_currentCts = new CancellationTokenSource();
// ✅ Dispose in finally
try { /* ... */ }
finally
{
    _currentCts?.Dispose();
    _currentCts = null;
}
Platform Pitfalls
| Platform | Pitfall |
|---|---|
| iOS | Missing either plist key → runtime crash |
| Android | User denies permission twice → OS stops prompting; must redirect to Settings |
| All | No timeout → battery drain from indefinite listening |
| All | Calling StartListeningAsync while already listening → returns error, not exception |
Architecture Tips
- Wrap `ISpeechToText` in a service: Don't use `SpeechToText.Default` directly in ViewModels. Wrap it in `ISpeechRecognitionService` for testability and state management.
- Use partial results for UX: Subscribe to `PartialResultReceived` for live transcription feedback. Users expect to see words appear as they speak.
- Continuous listening = loop with delay: Loop `StartListeningAsync` with small delays (`Task.Delay(100)`) for conversation mode.
- Guard against double-start: Check state before starting, for example `if (State == SpeechRecognitionState.Listening) return new SpeechRecognitionResultDto { Success = false, ErrorMessage = "Already listening" };`
- Natural language output: CommunityToolkit.Maui returns normalized, punctuated text, not raw phonemes. No post-processing needed for basic use cases.
- UI-agnostic service: The `ISpeechRecognitionService` pattern works identically with XAML/MVVM, C# Markup, and MauiReactor. See `references/speech-to-text-api.md` for all three patterns.
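The double-start guard and the conversation-mode loop from the tips above can be sketched together. This is a simplified stand-in, not the service from `references/speech-to-text-api.md`: `GuardedSpeechService`, the `recognizeOnce` delegate, the `Text` property, and the `Idle` state value are hypothetical, and the two types are redeclared here in reduced form only so the example is self-contained. It uses an atomic flag rather than a plain state check so that two near-simultaneous calls cannot both start.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Reduced, hypothetical versions of the types named in the tips.
public enum SpeechRecognitionState { Idle, Listening }

public sealed class SpeechRecognitionResultDto
{
    public bool Success { get; init; }
    public string Text { get; init; } = string.Empty;
    public string ErrorMessage { get; init; } = string.Empty;
}

public sealed class GuardedSpeechService
{
    // Stands in for one recognition session against the underlying engine.
    private readonly Func<CancellationToken, Task<string>> _recognizeOnce;
    private int _listening; // 0 = idle, 1 = listening

    public GuardedSpeechService(Func<CancellationToken, Task<string>> recognizeOnce)
        => _recognizeOnce = recognizeOnce;

    public SpeechRecognitionState State =>
        Volatile.Read(ref _listening) == 1
            ? SpeechRecognitionState.Listening
            : SpeechRecognitionState.Idle;

    public async Task<SpeechRecognitionResultDto> StartListeningAsync(CancellationToken ct = default)
    {
        // Atomic check-and-set: a second caller gets an error result, not an exception.
        if (Interlocked.CompareExchange(ref _listening, 1, 0) != 0)
            return new SpeechRecognitionResultDto { Success = false, ErrorMessage = "Already listening" };
        try
        {
            var text = await _recognizeOnce(ct);
            return new SpeechRecognitionResultDto { Success = true, Text = text };
        }
        catch (OperationCanceledException)
        {
            return new SpeechRecognitionResultDto { Success = false, ErrorMessage = "Cancelled" };
        }
        finally
        {
            Volatile.Write(ref _listening, 0);
        }
    }

    // Conversation mode: restart after each result, pausing briefly between sessions.
    public async Task RunConversationAsync(Action<SpeechRecognitionResultDto> onResult, CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            onResult(await StartListeningAsync(ct));
            try { await Task.Delay(100, ct); }
            catch (OperationCanceledException) { break; }
        }
    }
}
```

Because the engine is injected as a delegate, a ViewModel test can pass a fake that returns canned text, which is the testability benefit the first tip refers to.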
Checklist
- `CommunityToolkit.Maui` NuGet installed (look up current version)
- `UseMauiCommunityToolkit()` called in `MauiProgram.cs`
- `ISpeechToText` registered as singleton via DI
- iOS: Both `NSSpeechRecognitionUsageDescription` and `NSMicrophoneUsageDescription` in Info.plist
- Android: `RECORD_AUDIO` permission in AndroidManifest.xml
- Permissions checked before every `StartListeningAsync` call
- Timeout configured (recommend 60 seconds max)
- Event handlers unsubscribed in `finally` blocks
- `CancellationTokenSource` disposed after use