Voice Extractor

AI-generated content all sounds the same. The fix isn't better prompts — it's teaching the AI how you actually communicate.

This skill extracts your communication DNA from writing samples and produces a Voice Guide: documented, tested, and ready to use.


Mode

Detect from context or ask: "Quick voice snapshot, full Voice Guide, or full guide with examples?"

| Mode | What you get | Best for |
|---|---|---|
| quick | Top 5 voice characteristics + 3 do/don't rules | Fast style reference, single piece |
| standard | Full Voice Guide: tone, vocabulary, rhythm, structure | AI training, ghostwriting, brand documentation |
| deep | Full Voice Guide + 10 sample rewrites + writing rules checklist + AI training examples | Onboarding writers, building a brand voice system |

Default: standard — use quick if they just need a fast reference. Use deep if they're onboarding a ghostwriter or building a content team.


Context Loading Gates

Before extracting, collect:

  • Writing samples: aim for 3 or more; 500 total words is the hard minimum for a full guide (see priority list below)
  • Purpose of voice guide: AI training? Ghostwriter onboarding? Team alignment?
  • Confidence zones: Any topics where they want to sound more/less authoritative?
  • Known anti-patterns: Any words or phrases they already know they want to avoid?

Sample priority (most → least authentic):

  1. Casual Slack or email (raw, unedited voice)
  2. Podcast or call transcript
  3. LinkedIn posts or articles
  4. Website copy (often edited, less authentic)

Minimum sample gate: If samples total under 500 words, stop:

"These samples are too short to extract reliable patterns. Please add 2-3 more — emails, Slack messages, or transcripts work best. The messier and more casual, the better."

Do not attempt a full extraction from under 500 words. Offer quick mode instead, which is meant for thin sample sets of roughly 300–500 words.
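The gate above can be sketched as a small word-count check. This is a minimal illustration, not part of the skill itself: the function names are hypothetical, and treating 300 words as a floor for quick mode is an assumption drawn from the 300–500 range quoted in the Quick Mode section.

```python
FULL_MIN_WORDS = 500   # hard minimum for a full extraction (from the gate)
QUICK_MIN_WORDS = 300  # assumption: lower bound of the quick-mode range

def count_words(text: str) -> int:
    """Count whitespace-separated tokens in one sample."""
    return len(text.split())

def gate(samples: list[str]) -> str:
    """Return 'full', 'quick', or 'need_more_samples' based on total words."""
    total = sum(count_words(s) for s in samples)
    if total >= FULL_MIN_WORDS:
        return "full"
    if total >= QUICK_MIN_WORDS:
        return "quick"
    # Below the quick-mode range: ask for more samples before doing anything.
    return "need_more_samples"
```

For example, `gate(["word " * 350])` falls in the quick range, so the skill would offer quick mode rather than a full guide.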


Phase 1: Sample Quality Assessment

Before extracting, reason through:

  1. Sample authenticity: Are these samples from edited/polished contexts (website, press) or raw contexts (Slack, email)? More polish = less authentic voice.
  2. Sample variety: Do the samples cover different contexts (professional, casual, educational)? Single-context samples produce single-dimension voice guides.
  3. Exclusion check: Identify and flag patterns that are NOT the authentic voice:
    • Platform formatting tics (LinkedIn line breaks, brevity forced by Twitter's character limit)
    • Typos and autocorrect errors
    • Phrases borrowed from others (quotes, retweets)
    • Unusually formal writing (legal docs, press releases)
  4. Sample size adequacy: Is there enough material for full mode (standard or deep), or should I use quick mode?
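Checks 1, 2, and 4 above lend themselves to a rough mechanical pass. The sketch below is illustrative only: the `Sample` type, the source labels, and the rule that quality is "high" when at least half the samples come from raw contexts are all assumptions, not something the skill specifies.

```python
from dataclasses import dataclass

# Authenticity rank taken from the sample priority list (1 = most authentic).
SOURCE_RANK = {"slack_or_email": 1, "transcript": 2, "linkedin": 3, "website": 4}

@dataclass
class Sample:
    source: str  # hypothetical label; must be a key of SOURCE_RANK
    text: str

def assess(samples: list[Sample]) -> dict:
    """Summarize size, authenticity, and variety for a set of samples."""
    words = sum(len(s.text.split()) for s in samples)
    raw = sum(1 for s in samples if SOURCE_RANK[s.source] <= 2)
    contexts = {s.source for s in samples}
    # Assumption: quality is "high" when at least half the samples are raw.
    quality = "high" if samples and raw * 2 >= len(samples) else "medium"
    return {
        "samples": len(samples),
        "words": words,
        "quality": quality,
        "single_context": len(contexts) <= 1,  # flags one-dimensional input
        "mode": "full" if words >= 500 else "quick",
    }
```

The exclusion check (item 3) is left out on purpose, since spotting borrowed phrases or platform tics takes judgment rather than counting.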

Output a sample assessment:

"I have [X samples / Y words] to work with. Quality: [high/medium — why]. I'll use [full/quick] mode. Excluding: [any patterns and why]."


Phase 2: Core Energy Extraction

Identify the fundamental communication mode:

Role:

  • Teacher (breaks things down systematically)
  • Challenger (pushes back on assumptions)
  • Cheerleader (builds confidence and momentum)
  • Straight-shooter (cuts through BS efficiently)

Default energy:

  • Calm authority ("Here's what works.")
  • High enthusiasm ("This is exciting — let me show you.")
  • Understated confidence ("I've seen this a hundred times.")

Recurring themes: What topics appear unprompted across samples? These are the things they actually care about.


Phase 3: Phrase Extraction (Systematic)

Scan all samples and extract:

Transition phrases (how they shift topics):

  • Quote exact examples from samples
  • Pattern: "Here's the thing...", "What I've learned...", "Let me put it differently..."

Emphasis phrases (how they land a point):

  • Quote exact examples
  • Pattern: "The reality is...", "This is the part people miss...", "Here's the actual problem..."

Closers (how they wrap up):

  • Quote exact examples
  • Pattern: "That's the move.", "Start there.", "You've got this."
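One way to surface candidate signature phrases before quoting them is to look for word sequences that recur across separate samples. The following is a sketch under assumptions: `recurring_phrases` is a hypothetical helper, and the three-word window and two-sample threshold are arbitrary starting points. It only proposes candidates; sorting them into transitions, emphasis phrases, and closers still needs a human or model read.

```python
import re
from collections import Counter

def ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of n-word sequences in text (lowercased).

    Note: only straight apostrophes are kept inside words; curly quotes
    would need normalizing first.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def recurring_phrases(samples: list[str], n: int = 3, min_samples: int = 2) -> list[str]:
    """Return n-word phrases that appear in at least min_samples distinct samples."""
    counts: Counter = Counter()
    for sample in samples:
        # ngrams() returns a set, so each sample counts a phrase once.
        counts.update(ngrams(sample, n))
    hits = [(" ".join(gram), c) for gram, c in counts.items() if c >= min_samples]
    hits.sort(key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in hits]
```

Counting per sample rather than per occurrence keeps one repetitive email from dominating the list.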

Phase 4: Confidence Zone Mapping

| Zone | Description | Language Markers |
|---|---|---|
| Full authority | Topics they're an expert in | No hedging, definitive statements, "here's what works" |
| Earned perspective | Topics with experience but not mastery | "In my experience...", "What I've found..." |
| Active exploration | Topics they're learning now | "I'm testing this...", "What I'm seeing..." |

Map their stated expertise areas to each zone. This calibration is what makes the voice feel real vs. one-dimensional.
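As a rough first pass, the language markers in the table can be used to tag sentences by zone. This sketch is an assumption-heavy illustration: `classify_zone` is a hypothetical helper, the marker lists contain only the examples from the table, and treating the absence of hedging as full authority is a simplification that should be checked against the person's stated expertise.

```python
# Markers copied from the zone table; extend with phrases found in the samples.
ZONE_MARKERS = {
    "earned_perspective": ("in my experience", "what i've found"),
    "active_exploration": ("i'm testing", "what i'm seeing"),
}

def classify_zone(sentence: str) -> str:
    """Tag a sentence with a confidence zone based on hedging markers."""
    lowered = sentence.lower()
    for zone, markers in ZONE_MARKERS.items():
        if any(marker in lowered for marker in markers):
            return zone
    # Assumption: no hedging marker means the sentence reads as full authority.
    return "full_authority"
```

Grouping the tagged sentences by topic then gives a starting map of which subjects sit in which zone.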


Phase 5: Anti-Pattern Documentation

Extract what they'd NEVER say:

  • Words that would feel wrong in their voice
  • Phrases that make them cringe
  • Tones they naturally avoid
  • Industry jargon they hate

Source these from sample evidence where possible: "You never used [word] across [X samples] — it doesn't fit your voice."
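The evidence line above can be backed by a simple absence check. This is a minimal sketch with hypothetical helper names; the candidate list has to come from the user or from a jargon list you supply. Absence from a small sample set is weak evidence, so treat the result as a prompt for confirmation rather than a verdict.

```python
import re

def unused_candidates(samples: list[str], candidates: list[str]) -> list[str]:
    """Return candidate words/phrases that appear in none of the samples."""
    unused = []
    for candidate in candidates:
        # Whole-word match, case-insensitive.
        pattern = re.compile(r"\b" + re.escape(candidate) + r"\b", re.IGNORECASE)
        if not any(pattern.search(sample) for sample in samples):
            unused.append(candidate)
    return unused

def evidence_line(word: str, sample_count: int) -> str:
    """Format the evidence sentence used in the anti-pattern section."""
    return f"You never used '{word}' across {sample_count} samples."
```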


Phase 6: Validation Test (REQUIRED)

After extracting the full profile, generate 2 test sentences on the same topic:

Version A (using the extracted voice profile):

"[Sample sentence in their voice]"

Version B (wrong voice — contrasting example):

"[Same content, different voice — shows what to avoid]"

Ask the user: "Does Version A actually sound like you when you're not overthinking it? What feels off?"

This validation catches extraction errors before the guide is put into production.


Quick Mode (--quick)

When samples are thin (300–500 words) or time is short:

  1. Read 3 samples fast
  2. Pull 10 signature phrases
  3. Note 3 things they'd never say
  4. Write 1 sentence describing their energy

Output: Minimum viable voice guide.

Difference from full mode (standard or deep):

  • Quick: ~10 phrases, 3 anti-patterns, 1-sentence energy descriptor
  • Full: Complete profile with confidence calibration, validated test sentences, and source-cited examples

Phase 7: Self-Critique Pass (REQUIRED)

After generating the Voice Guide:

  • Are the extracted phrases actually from the samples, or am I inferring them?
  • Does the anti-pattern list include specific words/phrases, or just vague categories?
  • Do the validation test sentences demonstrate a real difference between in-voice and out-of-voice?
  • Is the confidence zone mapping specific to named topics, or just generic?
  • Would a ghostwriter be able to use this guide without asking follow-up questions?

Flag any issues: "The anti-pattern section only has 2 entries — not enough for a usable guide. I need more samples or direct input from the user."
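The first question in the checklist (are the phrases actually from the samples?) is easy to verify mechanically. The sketch below assumes a hypothetical `unsourced_phrases` helper that does a case-insensitive substring match after collapsing whitespace and dropping trailing dots or ellipses; anything it returns was inferred rather than quoted.

```python
def _normalize(text: str) -> str:
    """Lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())

def unsourced_phrases(phrases: list[str], samples: list[str]) -> list[str]:
    """Return phrases that cannot be found verbatim in any sample."""
    normalized_samples = [_normalize(s) for s in samples]
    missing = []
    for phrase in phrases:
        # Trailing dots and ellipses are quoting style, not part of the phrase.
        needle = _normalize(phrase).rstrip(".\u2026 ")
        if not any(needle in sample for sample in normalized_samples):
            missing.append(phrase)
    return missing
```

An empty result means every quoted phrase has a source; a non-empty one is a list to either cite or remove from the guide.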


Output Structure

## Voice Guide: [Name] — [Date]

### Sample Assessment
- Samples: [count, types]
- Total words: [count]
- Quality: [high/medium — reason]
- Mode: [quick/full]
- Excluded: [patterns excluded + why]

---

### Core Energy
- Role: [teacher/challenger/cheerleader/straight-shooter]
- Default energy: [description]
- Recurring themes: [list]

### Signature Phrases
**Transitions:**
- "[Phrase]" (source: [email/post])
- "[Phrase]"

**Emphasis:**
- "[Phrase]" (source: [email/post])

**Closers:**
- "[Phrase]"

### Confidence Calibration
**Full authority (no hedging):**
Topics: [list]
Sounds like: "[example sentence]"

**Earned perspective:**
Topics: [list]
Sounds like: "[example sentence]"

**Active exploration:**
Topics: [list]
Sounds like: "[example sentence]"

### Anti-Patterns (Never Use)
- [Word/phrase] — why: [evidence from samples]
- [Word/phrase] — why: [evidence]

### Validation Test
**This sounds like you:**
"[Version A]"

**This doesn't:**
"[Version B — contrast]"

### Self-Critique Notes
[Any gaps, things to validate with user]

### Usage Instructions
- For AI: Paste this guide into your system prompt
- For ghostwriter: Share on day 1 to reduce revision cycles
- For team: This is the benchmark for "on brand"

Skill by Brian Wagner | AI Marketing Architect | brianrwagner.com
