audio-voice-recovery
Forensic Audio Research: Audio Voice Recovery Best Practices
Comprehensive audio forensics and voice recovery guide for recovering intelligible voice from low-quality, low-volume, or damaged audio recordings. Contains 45 rules across 8 categories, prioritized by impact, to guide audio enhancement, forensic analysis, and transcription workflows.
When to Apply
Reference these guidelines when:
- Recovering voice from noisy or low-quality recordings
- Enhancing audio for transcription or legal evidence
- Performing forensic audio authentication
- Analyzing recordings for tampering or splices
- Building automated audio processing pipelines
- Transcribing difficult or degraded speech
Rule Categories by Priority
| Priority | Category | Impact | Prefix | Rules |
|---|---|---|---|---|
| 1 | Signal Preservation & Analysis | CRITICAL | signal- | 5 |
| 2 | Noise Profiling & Estimation | CRITICAL | noise- | 5 |
| 3 | Spectral Processing | HIGH | spectral- | 6 |
| 4 | Voice Isolation & Enhancement | HIGH | voice- | 7 |
| 5 | Temporal Processing | MEDIUM-HIGH | temporal- | 5 |
| 6 | Transcription & Recognition | MEDIUM | transcribe- | 5 |
| 7 | Forensic Authentication | MEDIUM | forensic- | 5 |
| 8 | Tool Integration & Automation | LOW-MEDIUM | tool- | 7 |
Quick Reference
1. Signal Preservation & Analysis (CRITICAL)
- signal-preserve-original - Never modify original recording
- signal-lossless-format - Use lossless formats for processing
- signal-sample-rate - Preserve native sample rate
- signal-bit-depth - Use maximum bit depth for processing
- signal-analyze-first - Analyze before processing
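The preservation rules above can be sketched as a small helper that hashes the evidence file and hands back a working copy, so processing never touches the original. This is a minimal illustration; the function name and file layout are assumptions, not part of the bundled scripts.

```python
import hashlib
import shutil
from pathlib import Path

def make_working_copy(original: Path, workdir: Path) -> Path:
    """Record the original's SHA-256 (so it can later be verified as
    untouched) and return a copy to run all processing against."""
    workdir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(original.read_bytes()).hexdigest()
    (workdir / f"{original.name}.sha256").write_text(
        f"{digest}  {original.name}\n"
    )
    working = workdir / f"working_{original.name}"
    shutil.copy2(original, working)  # copy2 preserves timestamps
    return working
```

After every processing session, re-hash the original and compare against the stored digest before reporting results.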
2. Noise Profiling & Estimation (CRITICAL)
- noise-profile-silence - Extract noise profile from silent segments
- noise-identify-type - Identify noise type before reduction
- noise-adaptive-estimation - Use adaptive estimation for non-stationary noise
- noise-snr-assessment - Measure SNR before and after
- noise-avoid-overprocessing - Avoid over-processing and musical artifacts
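The profile-from-silence and SNR-assessment rules combine naturally: estimate noise power from a hand-picked noise-only segment, then compare against overall signal power. A minimal sketch, assuming NumPy float sample arrays (a real tool would also report band-wise estimates):

```python
import numpy as np

def estimate_snr_db(signal: np.ndarray, noise_segment: np.ndarray) -> float:
    """Estimate SNR by subtracting the power of a noise-only segment
    (e.g. a silent lead-in) from the recording's overall power."""
    noise_power = np.mean(noise_segment ** 2)
    signal_power = np.mean(signal ** 2)
    # Clamp so a noise-only recording does not produce log(<=0)
    speech_power = max(signal_power - noise_power, 1e-12)
    return 10 * np.log10(speech_power / noise_power)
```

Run this before and after each processing step and record both values, per noise-snr-assessment.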
3. Spectral Processing (HIGH)
- spectral-subtraction - Apply spectral subtraction for stationary noise
- spectral-wiener-filter - Use Wiener filter for optimal noise estimation
- spectral-notch-filter - Apply notch filters for tonal interference
- spectral-band-limiting - Apply frequency band limiting for speech
- spectral-equalization - Use forensic equalization to restore intelligibility
- spectral-declip - Repair clipped audio before other processing
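Single-frame magnitude spectral subtraction can be sketched as below. A production chain runs this over overlapping windowed STFT frames with overlap-add; the over-subtraction factor and spectral floor here are illustrative values, and the floor is what keeps residual bins from collapsing to zero and producing musical noise.

```python
import numpy as np

def spectral_subtract(frame: np.ndarray, noise_mag: np.ndarray,
                      over: float = 2.0, floor: float = 0.02) -> np.ndarray:
    """Subtract an estimated noise magnitude spectrum from one frame.
    `over` over-subtracts to account for noise variance; `floor`
    retains a small fraction of the original magnitude as a floor."""
    spec = np.fft.rfft(frame)
    mag = np.abs(spec)
    phase = np.angle(spec)  # noisy phase is reused, as is conventional
    clean_mag = np.maximum(mag - over * noise_mag, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(frame))
```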
4. Voice Isolation & Enhancement (HIGH)
- voice-rnnoise - Use RNNoise for real-time ML denoising
- voice-dialogue-isolate - Use source separation for complex backgrounds
- voice-formant-preserve - Preserve formants during pitch manipulation
- voice-dereverb - Apply dereverberation for room echo
- voice-enhance-speech - Use AI speech enhancement services for quick results
- voice-vad-segment - Use VAD for targeted processing
- voice-frequency-boost - Boost frequency regions for specific phonemes
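To illustrate voice-vad-segment, here is a toy energy-based voice activity detector that returns time spans above a level threshold. This only demonstrates the segmentation idea; real forensic work would use a trained VAD (e.g. Silero or the WebRTC VAD) that is robust to noise.

```python
import numpy as np

def vad_segments(x: np.ndarray, sr: int, frame_ms: int = 30,
                 threshold_db: float = -40.0) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds for contiguous runs of
    frames whose RMS level exceeds a fixed dBFS threshold."""
    hop = int(sr * frame_ms / 1000)
    spans: list[tuple[float, float]] = []
    for i in range(0, len(x) - hop + 1, hop):
        frame = x[i:i + hop]
        rms = np.sqrt(np.mean(frame ** 2)) + 1e-12
        if 20 * np.log10(rms) > threshold_db:
            t = i / sr
            if spans and abs(spans[-1][1] - t) < 1e-6:
                # Extend the previous span if this frame is adjacent
                spans[-1] = (spans[-1][0], t + frame_ms / 1000)
            else:
                spans.append((t, t + frame_ms / 1000))
    return spans
```

The returned spans can then be fed to targeted denoising or transcription of speech regions only.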
5. Temporal Processing (MEDIUM-HIGH)
- temporal-dynamic-range - Use dynamic range compression for level consistency
- temporal-noise-gate - Apply noise gate to silence non-speech segments
- temporal-time-stretch - Use time stretching for intelligibility
- temporal-transient-repair - Repair transient damage (clicks, pops, dropouts)
- temporal-silence-trim - Trim silence and normalize before export
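The noise-gate rule can be sketched as frame-wise attenuation below a threshold. Note the gate attenuates rather than hard-mutes, so gated regions do not sound unnaturally dead; production gates also add attack/release smoothing, omitted here for brevity.

```python
import numpy as np

def noise_gate(x: np.ndarray, sr: int, threshold_db: float = -45.0,
               attenuation_db: float = -30.0, frame_ms: int = 10) -> np.ndarray:
    """Attenuate frames whose RMS falls below `threshold_db` so
    low-level noise between words is pushed down without chopping
    speech onsets to absolute silence."""
    hop = int(sr * frame_ms / 1000)
    gain = 10 ** (attenuation_db / 20)
    out = x.copy()
    for i in range(0, len(x), hop):
        frame = x[i:i + hop]
        rms = np.sqrt(np.mean(frame ** 2)) + 1e-12
        if 20 * np.log10(rms) < threshold_db:
            out[i:i + hop] = frame * gain
    return out
```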
6. Transcription & Recognition (MEDIUM)
- transcribe-whisper - Use Whisper for noise-robust transcription
- transcribe-multipass - Use multi-pass transcription for difficult audio
- transcribe-segment - Segment audio for targeted transcription
- transcribe-confidence - Track confidence scores for uncertain words
- transcribe-hallucination - Detect and filter ASR hallucinations
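Whisper's segment dicts expose `avg_logprob` and `no_speech_prob`, which support the confidence-tracking and hallucination-filtering rules above. A sketch of splitting output into trusted text and items needing human review; the thresholds are illustrative, not official defaults.

```python
def flag_uncertain(segments, logprob_floor=-1.0, no_speech_max=0.6):
    """Partition ASR segments into trusted text and flagged segments.
    High no_speech_prob with text present is a classic hallucination
    signature; very low avg_logprob marks low-confidence decoding."""
    trusted, flagged = [], []
    for seg in segments:
        if (seg["no_speech_prob"] > no_speech_max
                or seg["avg_logprob"] < logprob_floor):
            flagged.append(seg)
        else:
            trusted.append(seg["text"].strip())
    return " ".join(trusted), flagged
```

For evidence work, flagged segments should be re-listened to and transcribed manually rather than silently dropped.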
7. Forensic Authentication (MEDIUM)
- forensic-enf-analysis - Use ENF analysis for timestamp verification
- forensic-metadata - Extract and verify audio metadata
- forensic-tampering - Detect audio tampering and splices
- forensic-chain-custody - Document chain of custody for evidence
- forensic-speaker-id - Extract speaker characteristics for identification
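One screening heuristic for the tampering rule is to flag sample-to-sample jumps far outside the recording's typical first-difference distribution, since crude edits and dropouts produce waveform discontinuities. This is a sketch with an illustrative z-score threshold; hits warrant spectrogram review and critical listening, never a tampering verdict on their own.

```python
import numpy as np

def find_discontinuities(x: np.ndarray, sr: int, z: float = 8.0) -> list[float]:
    """Return times (seconds) of abnormally large sample-to-sample
    jumps, a possible sign of splices, edits, or dropouts."""
    d = np.abs(np.diff(x))
    thresh = d.mean() + z * d.std()
    return [i / sr for i in np.flatnonzero(d > thresh)]
```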
8. Tool Integration & Automation (LOW-MEDIUM)
- tool-ffmpeg-essentials - Master essential FFmpeg audio commands
- tool-sox-commands - Use SoX for advanced audio manipulation
- tool-python-pipeline - Build Python audio processing pipelines
- tool-audacity-workflow - Use Audacity for visual analysis and manual editing
- tool-install-guide - Install audio forensic toolchain
- tool-batch-automation - Automate batch processing workflows
- tool-quality-assessment - Measure audio quality metrics
Essential Tools
| Tool | Purpose | Install |
|---|---|---|
| FFmpeg | Format conversion, filtering | brew install ffmpeg |
| SoX | Noise profiling, effects | brew install sox |
| Whisper | Speech transcription | pip install openai-whisper |
| librosa | Python audio analysis | pip install librosa |
| noisereduce | ML noise reduction | pip install noisereduce |
| Audacity | Visual editing | brew install audacity |
Workflow Scripts (Recommended)
Use the bundled scripts to generate objective baselines, create a workflow plan, and verify results.
- scripts/preflight_audio.py - Generate a forensic preflight report (JSON or Markdown).
- scripts/plan_from_preflight.py - Create a workflow plan template from the preflight report.
- scripts/compare_audio.py - Compare objective metrics between baseline and processed audio.
Example usage:
```bash
# 1) Analyze and capture baseline metrics
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2) Generate a workflow plan template
python3 skills/.experimental/audio-voice-recovery/scripts/plan_from_preflight.py --preflight preflight.json --out plan.md

# 3) Compare baseline vs processed metrics
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
Forensic Preflight Workflow (Do This Before Any Changes)
Align preflight with SWGDE Best Practices for the Enhancement of Digital Audio (20-a-001) and SWGDE Best Practices for Forensic Audio (08-a-001).
Establish an objective baseline state and plan the workflow so processing does not introduce clipping, artifacts, or false "done" confidence.
Use scripts/preflight_audio.py to capture baseline metrics and preserve the report with the case file.
Capture and record before processing:
- Record evidence identity and integrity: path, filename, file size, SHA-256 checksum, source, format/container, codec
- Record signal integrity: sample rate, bit depth, channels, duration
- Measure baseline loudness and levels: LUFS/LKFS, true peak, peak, RMS, dynamic range, DC offset
- Detect clipping and document clipped-sample percentage, peak headroom, exact time ranges
- Identify noise profile: stationary vs non-stationary, dominant noise bands, SNR estimate
- Locate the region of interest (ROI) and document time ranges and changes over time
- Inspect spectral content and estimate speech-band energy and intelligibility risk
- Scan for temporal defects: dropouts, discontinuities, splices, drift
- Evaluate channel correlation and phase anomalies (if stereo)
- Extract and preserve metadata: timestamps, device/model tags, embedded notes
Procedure:
- Prepare a forensic working copy, verify hashes, and preserve the original untouched.
- Locate ROI and target signal; document exact time ranges and changes across the recording.
- Assess challenges to intelligibility and signal quality; map challenges to mitigation strategies.
- Identify required processing and plan a workflow order that avoids unwanted artifacts. Generate a plan draft with scripts/plan_from_preflight.py and complete it with case-specific decisions.
- Measure baseline loudness and true peak per ITU-R BS.1770 / EBU R 128 and record peak/RMS/DC offset.
- Detect clipping and dropouts; if clipping is present, declip first or pause and document limitations.
- Inspect spectral content and noise type; collect representative noise profile segments and estimate SNR.
- If stereo, evaluate channel correlation and phase; document anomalies.
- Create a baseline listening log (multiple devices) and define success criteria for intelligibility and listenability.
Failure-pattern guardrails:
- Do not process until every preflight field is captured.
- Document every process, setting, software version, and time segment to enable repeatability.
- Compare each processed output to the unprocessed input and assess progress toward intelligibility and listenability.
- Avoid over-processing; review removed signal (filter residue) to avoid removing target signal components.
- Keep intermediate files uncompressed and preserve sample rate/bit depth when moving between tools.
- Perform a final review against the original; if unsatisfactory, revise or stop and report limitations.
- If the request is not achievable, communicate limitations and do not declare completion.
- Require objective metrics and A/B listening before declaring completion.
- Do not rely solely on objective metrics; corroborate with critical listening.
- Take listening breaks to avoid ear fatigue during extended reviews.
Quick Enhancement Pipeline
```bash
# 1. Analyze original (run preflight and capture baseline metrics)
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2. Create working copy with checksum
cp evidence.wav working.wav
sha256sum evidence.wav > evidence.sha256

# 3. Apply enhancement
ffmpeg -i working.wav -af "\
highpass=f=80,\
adeclick=w=55:o=75,\
afftdn=nr=12:nf=-30:nt=w,\
equalizer=f=2500:t=q:w=1:g=3,\
loudnorm=I=-16:TP=-1.5:LRA=11\
" enhanced.wav

# 4. Transcribe
whisper enhanced.wav --model large-v3 --language en

# 5. Verify original unchanged
sha256sum -c evidence.sha256

# 6. Verify improvement (objective comparison + A/B listening)
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
How to Use
Read individual reference files for detailed explanations and code examples:
- Section definitions - Category structure and impact levels
- Rule template - Template for adding new rules
Reference Files
| File | Description |
|---|---|
| AGENTS.md | Complete compiled guide with all rules |
| references/_sections.md | Category definitions and ordering |
| assets/templates/_template.md | Template for new rules |
| metadata.json | Version and reference information |