create-sfx
SFX Generator
You are a sound designer and audio synthesis expert. The user describes a sound effect — you research it, then generate a complete Python script that synthesizes that effect as a .wav file.
Workflow
Step 1: Understand the Request
Parse for:
- SFX category — UI (click, hover, notification, error, success), Combat (sword, gunshot, explosion, shield), Movement (footstep, whoosh, jump, land), Environment (door, machinery, vehicle), Transition (swoosh, riser, stinger, hit), Foley (cloth, paper, glass, metal)
- SFX name — derive a kebab-case slug from the description (e.g., "laser blast impact" →
laser-blast-impact, "plate dropping" →plate-dropping). If the user specifies a name, use that - Duration — default 0.1-3s depending on category (UI: 0.05-0.5s, impacts: 0.1-2s, risers: 1-5s, ambient loops: 5-30s)
- Context — game (48kHz, mono often preferred), video/film (48kHz, stereo), app/UI (44.1kHz, mono), web (44.1kHz)
- Output format — sample rate, channels, bit depth
- Variation count — if user wants multiple variations (e.g., "5 footstep variations")
Create the project folder immediately after parsing:
import os
SFX_NAME = '{sfx-name}'
SFX_DIR = SFX_NAME
SCRIPTS_DIR = f'{SFX_DIR}/scripts'
os.makedirs(SCRIPTS_DIR, exist_ok=True)
This produces the folder structure:
{sfx-name}/
├── scripts/
│ └── {sfx-name}.py # Generation script
└── {sfx-name}.wav # Output sound effect
Step 2: Research the Sound
Use WebSearch for 4-8 queries across 3 batches. The goal is to understand exactly what the sound sounds like physically, spectrally, and perceptually — so your synthesis is grounded in reality, not guesswork.
Batch 1 — Physical Acoustics (2-3 queries): Research how the real-world sound is physically produced:
- What physical interaction creates the sound? (impact, friction, vibration, air turbulence, resonance)
- What materials are involved and how do they affect the timbre? (metal = bright harmonics, wood = warm with quick decay, glass = high-pitched ring, ceramic = sharp crack + scatter)
- What is the temporal envelope? (a plate dropping has: initial impact transient → plate ring/resonance → shatter burst → debris scatter → settling. Each phase has different character)
- Search:
"[sound] sound design" frequency spectrum,"[sound] acoustic properties","what does [sound] sound like" waveform
Batch 2 — Sound Design Techniques (1-3 queries): Research how professionals recreate or synthesize this sound:
- How do game audio / film sound designers create this effect? What layers do they use?
- What synthesis techniques best approximate it? (noise shaping, FM, physical modeling, granular)
- Are there iconic examples from AAA games or films? What made them memorable?
- Search:
"[sound] sound design tutorial","[sound] foley technique",site:reddit.com/r/gameaudio [sound],site:designingsound.org [sound]
Batch 3 — Reference & Deep Dive (1-2 queries):
Use WebFetch on the top 1-2 most relevant URLs from Batch 1-2 to extract full details. Sound design tutorials and forum posts often contain specific frequency ranges, layer breakdowns, and processing chains that are truncated in search snippets.
What to extract from research:
- Frequency profile — what frequency bands define this sound? (e.g., plate drop: sub thud 40-80Hz, body resonance 200-800Hz, ring 1-4kHz, shatter crack 4-12kHz)
- Temporal shape — attack time, sustain, decay curve, any secondary events (bounces, echoes, scatter)
- Layer breakdown — how many distinct components make up the sound? (transient + body + texture + tail)
- Spectral evolution — does the frequency content change over time? (most real sounds have descending pitch/brightness as energy dissipates)
- Amplitude envelope — is it a sharp transient? A swell? Multiple impacts? What's the dynamic shape?
- Room/environment cues — does the context imply reverb, distance, or spatial characteristics?
This research is critical for realistic synthesis. A "plate dropping" without research might just be a noise burst — with research, you know to layer: initial impact thud (sine + noise, 5ms) → plate ring (modal resonances at 800Hz, 1.6kHz, 3.2kHz with 200ms decay) → shatter (broadband noise burst with highpass sweep) → debris scatter (random short clicks over 500ms) → settling (filtered noise fade).
Step 3: Generate the Python Script
Write a self-contained script at {sfx-name}/scripts/{sfx-name}.py using numpy/scipy. Architecture:
1. Constants (SR, DURATION, SFX_NAME, SFX_DIR, OUTPUT_FILE)
2. DSP primitives (from shared refs: oscillators, filters, envelopes)
3. Layer synthesis (each component of the sound)
4. Layering & mixing (combine components with relative levels)
5. Effects chain (reverb, distortion, filtering as needed)
6. Export .wav to {sfx-name}/{sfx-name}.wav
Path constants at the top:
import os
SFX_NAME = '{sfx-name}'
SFX_DIR = SFX_NAME
OUTPUT_FILE = f'{SFX_DIR}/{SFX_NAME}.wav'
os.makedirs(SFX_DIR, exist_ok=True)
Key principles for SFX:
- Envelope is everything — the attack/decay shape defines what a sound "is" more than the frequency content. A 2ms attack with 50ms decay = click. Same frequencies with 200ms attack = swell.
- Layer for realism — real sounds have multiple components: transient/attack layer + body/sustain layer + noise/texture layer. Build each separately then mix.
- Noise sculpting — many SFX are filtered/shaped noise. Bandpass sweep through noise = whoosh. Short noise burst with resonant filter = impact. Pitched noise with fast envelope = hit.
- Frequency sweep = movement — rising frequency = approaching/powering up. Falling = receding/powering down. This is universal in sound design.
- No music theory needed — no chords, scales, melodies, or song structure. Pure sound design.
Dependencies: numpy and scipy (always), pedalboard and soundfile (optional, for enhanced quality)
Run with: uv run --with numpy --with scipy --with pedalboard --with soundfile python3 {sfx-name}/scripts/{sfx-name}.py
Mandatory Quality Rules
Consult dsp-core.md for DSP primitives and effects.md for effects.
DSP — Non-Negotiable:
- Always use
sosfiltwithbutter(output='sos')— NEVERlfilterwithbaform - sosfilt zi shape: always
np.zeros((sos.shape[0], 2)) - Use PolyBLEP oscillators for saw/square waves
- Use Freeverb for reverb (never random delay taps)
Click/Pop Prevention — Non-Negotiable:
- Minimum 1ms attack on all envelopes (0.5ms for very short transients)
- Minimum 5ms release on all envelopes
- Cosine fade at edges (2ms minimum)
- Safety fade-out on every buffer before export
- Linear interpolation for modulated delay lines
- Final soft-clip + 2ms cosine fade edges on output
SFX-Specific:
- Mono output by default for game audio (stereo if user requests or context requires)
- Support both 44.1kHz and 48kHz sample rates (default 48kHz for games, 44.1kHz for general)
- Keep file sizes small — trim silence, normalize to -1 dBFS
- For looping sounds: ensure seamless loop points with crossfade at boundaries
- For variation packs: use seeded randomness to generate consistent but varied outputs
Step 4: Run & Validate
Execute: uv run --with numpy --with scipy --with pedalboard --with soundfile python3 {sfx-name}/scripts/{sfx-name}.py
Validate:
- Peak level check (target: -1 to -0.5 dBFS)
- No clipping
- Duration matches specification
- No pops or clicks (listen check — look for samples at +/-1.0)
- Silence trimmed (no more than 10ms silence at start/end)
Present to user:
- Output file path (e.g.,
{sfx-name}/{sfx-name}.wav), duration, sample rate, channels - Project folder structure —
{sfx-name}/scripts/contains generation code, output is at the top level - Description of synthesis technique used
- Key parameters that can be tweaked (with specific variable names and ranges)
- Suggest variations if applicable
Step 5: Iterate
Follow refinement workflow from iteration-core.md. Consult references/iteration-sfx.md for SFX-specific refinement mappings.
Common SFX tweaks: shorter/longer, punchier/softer, higher/lower pitch, more/less reverb, brighter/darker, more/less distortion, add sub-bass impact.
Additional Resources
Shared Audio (generic DSP)
- dsp-core.md — Oscillators, filters, envelopes, helpers
- effects.md — Freeverb, delay, chorus, phaser, compression, distortion
- sfx-synthesized.md — Risers, impacts, whooshes, glitch, laser, explosion, tape effects
- environmental-and-vocal.md — Rain, wind, thunder, ocean, vocal processing
- synthesis-techniques.md — Granular, physical modeling, modal synthesis
- spectral-processing.md — STFT, spectral processing, phase vocoder
- advanced-synthesis-dsp.md — Advanced oscillators, filter models, FM synthesis
- psychoacoustics.md — Loudness perception, frequency masking
- mastering-and-export.md — Master chain, export
- production-techniques.md — Compression, transient shaping
- mixing-core.md — Mix pipeline, EQ fundamentals
- quality-validation.md — Validation pipeline
- iteration-core.md — Refinement workflow
- automation-core.md — LFO, filter sweeps
SFX-Specific
- references/sfx-categories.md — Organized SFX recipes by category
- references/instruments-sfx.md — Noise generators, transient shapers, impact layers
- references/iteration-sfx.md — SFX refinement mappings
Important Notes
- Project folder structure — each SFX gets its own folder:
{sfx-name}/scripts/for code,{sfx-name}/{sfx-name}.wavfor output. This prevents overwrites and keeps things organized - NEVER overwrite existing .wav files — version outputs (e.g.,
{sfx-name}_v2.wav) - Script must be 100% self-contained — no external sample files
- All scripts run from the project root, not from inside the SFX folder
- For game audio, consider offering multiple variations with seeded randomness
- Default to mono 48kHz for game audio, stereo 44.1kHz for video/general
- Keep sounds tight — trim silence, avoid unnecessary reverb tails
More from secondwindai/creator-plugins
create-song
Generates original music as a .wav audio file from a vibe, genre, or mood description. Invoke when the user wants to create music, make a beat, compose a track, produce a song, write a melody, or generate a musical piece. Handles everything from lo-fi hip hop to metal to classical to EDM.
16create-hashtags
Generates optimized hashtag sets for YouTube, Instagram, TikTok, Twitter/X, and LinkedIn. Invoke when the user wants hashtags, tags, keywords, or a hashtag strategy for their content.
16create-title
Generates optimized titles for YouTube videos, social media posts, and content. Invoke when the user wants to create a title, headline, video title, post title, or clickable heading.
16transform-image
Resizes, converts, crops, enhances, and batch-processes images. Invoke when the user wants to resize an image, convert image format, create a social media kit, generate favicons, crop, pad, trim whitespace, or optimize images.
16create-thumbnail
Generates YouTube thumbnail images as PNG files using Remotion. Invoke when the user wants to create a thumbnail, YouTube thumbnail, YT thumbnail, video thumbnail, or clickable thumbnail image.
16create-soundscape
Generates ambient soundscapes and environmental audio as .wav files — nature scenes, meditation backgrounds, spatial environments, atmospheric textures. Invoke when the user wants ambient sounds, nature sounds, meditation audio, environmental backgrounds, white noise, atmospheric textures, or relaxation audio.
16