
Humanize EN

Strip AI writing tells from English prose. Preserve meaning, structure, code blocks, links, anchors, and frontmatter — rewrite only the flagged phrasing.

Additional context from the user: $ARGUMENTS

Scope

This skill removes AI slop. It does not inject personality — that can break technical documentation, formal specs, and any neutral-voice register. The goal is a clean, direct, human-edited register, not an opinionated blog post. If the source is an opinion piece and the user explicitly asks for voice, references/voice.md covers the optional voice-calibration pass.

Brand voice integration (optional)

When $ARGUMENTS starts with -f <voice-doc>, load a BRAND-VOICE.md (typically produced by /brand-voice) and apply its brand-specific rules in addition to the 32 universal patterns.

Workflow:

  1. Strip -f <voice-doc> from the head of $ARGUMENTS. The remainder follows the Input modes table below as usual.

  2. Verify <voice-doc> exists with Glob. Missing or unreadable → degrade to default behavior with an explicit warning ("<path> not found — applying universal patterns only"). Never crash.

  3. Resolve the rules by running extract_rules.py --full on the voice doc. The script flattens YAML into plain text, automatically resolves any voice.extends chain, applies _replace and _remove overrides, and emits the merged rule block. Resolution order for the script path:

    1. ${CLAUDE_SKILL_DIR}/../brand-voice/scripts/extract_rules.py (sibling install)
    2. ~/.claude/skills/brand-voice/scripts/extract_rules.py (user-installed brand-voice)
    3. ~/.agents/skills/brand-voice/scripts/extract_rules.py (Anthropic skills directory)

    Invoke via python3 <resolved-script> --full <voice-doc>. If the script exits non-zero, surface the stderr to the user (it carries chain-resolution errors like extends-cycle, extends-depth-exceeded, extends-parent-not-found) and abort the brand-aware pass.

    Fallback — when none of the candidate paths resolve (the user has only humanize-en installed, no brand-voice), emit a one-line warning "brand-voice scripts unavailable; chain resolution skipped" and Read the voice doc directly, parsing only the YAML frontmatter (the block between the leading --- delimiters). The fields are: forbidden_lexicon, required_lexicon, rewrite_rules (each with reject, accept, rule_id), sentence_norms, forbidden_patterns, pronouns, core_attributes, contexts. Skip the prose sections. This fallback path does not resolve voice.extends; if the doc declares it, warn again so the user knows the chain was skipped.

  4. Merge with the 32 universal patterns. Brand rules win on conflict — the user's contract overrides the default catalog (e.g., a voice that requires em-dashes overrides pattern #14).

  5. Apply the rewrite as usual. Cite both pattern numbers (#14) and brand rule_ids ([no-hedging-imperative]) in the Patterns removed report so the source of each change is traceable.
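The three-path resolution order in step 3 can be sketched as a small helper. This is a sketch, not the skill's runtime: `skill_dir` stands in for `${CLAUDE_SKILL_DIR}` and `home` for the user's home directory.

```python
from pathlib import Path

def resolve_extract_rules(skill_dir, home):
    """Return the first existing extract_rules.py candidate, or None.

    Mirrors the resolution order in step 3: sibling install, then
    user-installed brand-voice, then the Anthropic skills directory.
    """
    tail = Path("brand-voice") / "scripts" / "extract_rules.py"
    candidates = [
        Path(skill_dir).parent / tail,             # 1. sibling install
        Path(home) / ".claude" / "skills" / tail,  # 2. user-installed
        Path(home) / ".agents" / "skills" / tail,  # 3. Anthropic skills dir
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # caller emits the warning and uses the frontmatter fallback
```

A None return triggers step 3's fallback: warn once and parse the voice doc's frontmatter directly.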

If the user wants brand-aware rewriting and no voice doc exists, defer: "No BRAND-VOICE.md at <path>. Run /brand-voice extract first."
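The frontmatter-only fallback in step 3 reduces to splitting the doc on its leading --- delimiters. A minimal sketch (the real extract_rules.py path additionally resolves voice.extends chains, which this does not):

```python
def split_frontmatter(text):
    """Split a voice doc into (frontmatter, body).

    Frontmatter is the block between the leading '---' delimiters;
    a doc without one yields an empty frontmatter string.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # unterminated block: treat the whole file as body
```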

Input modes

Resolve $ARGUMENTS (after stripping any leading -f <voice-doc>) as follows:

| Input shape | Behavior |
| --- | --- |
| Empty | Ask the user to paste text or provide a file path. Do not guess. |
| Prose file path | Read the file. Audit, propose a diff, apply only on explicit approval via Edit. |
| Non-prose file path | Refuse: "Non-prose file — this skill targets prose documents, not structured data or source code." Direct the user to /fix-grammar for docstring grammar, or to rewrite comments manually. |
| Inline text (anything else) | Humanize in place and return the rewritten text in the chat. |

Prose extensions (treat as file): .md, .mdx, .txt, .rst, .tex, .html, .adoc.

Non-prose extensions (refuse as file): .json, .yaml, .yml, .toml, .csv, .tsv, .xml, and any source-code file (.py, .ts, .js, .rs, .go, .java, …). Rewriting data or code files would break parsing or semantics even when the rewrite looks harmless.

Classify the first token (use Glob to verify the path exists — stay within allowed-tools, do not shell out):

  • resolves to an existing file AND extension on the prose list → Prose file path (process it)
  • resolves to an existing file AND extension on the non-prose list → Non-prose file path (refuse per the table above)
  • resolves to an existing file AND extension on neither list → ask the user whether to process it as prose or refuse it as non-prose. Do not guess — real cases like CHANGELOG (no extension) or notes.log go here.
  • does not resolve → treat the whole input as inline text

The two middle branches are what actually prevent data / source-code / unknown files from being silently humanized as inline strings.
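The four branches above can be sketched as a single classifier. The `exists` flag stands in for the Glob check; the extension lists are copied from this section:

```python
from pathlib import Path

PROSE = {".md", ".mdx", ".txt", ".rst", ".tex", ".html", ".adoc"}
NON_PROSE = {".json", ".yaml", ".yml", ".toml", ".csv", ".tsv", ".xml",
             ".py", ".ts", ".js", ".rs", ".go", ".java"}

def classify(token, exists):
    """Classify the first token of $ARGUMENTS into one of the four branches."""
    if not exists:
        return "inline"           # whole input is inline text
    ext = Path(token).suffix.lower()
    if ext in PROSE:
        return "prose-file"       # process it
    if ext in NON_PROSE:
        return "non-prose-file"   # refuse per the table
    return "ask-user"             # CHANGELOG, notes.log, etc.
```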

Process

  1. Read fully — the whole text, not one paragraph at a time. Patterns compound across sentences (rule-of-three + synonym cycling + promotional tone often ride together).
  2. Prescan mechanically — for file inputs, run ${CLAUDE_SKILL_DIR}/scripts/prescan.py <file> (or pipe inline text via -). It emits a JSON hit-list for the 8 highest-signal mechanical patterns (#1, #4, #7, #8, #9, #14, #23, #28). Start the rewrite from the flagged lines. Subjective patterns (tone, rule-of-three in context, vague attributions) stay LLM-only.
  3. Full detect pass — scan against the 32 patterns in references/patterns.md. The prescan catches roughly 60-70% of real hits; the full catalog catches the rest.
  4. Draft rewrite — replace flagged phrasing with direct, specific alternatives. Keep sentence-level meaning intact. See Preservation rules below for what stays verbatim and what may still be adjusted.
  5. Self-audit — ask: "What still reads as obviously AI-generated?" List remaining tells in 2–4 bullets. Revise.
  6. Report — present the final rewrite plus a short changelog of which patterns were touched (by number from the catalog). For file inputs, propose the diff and wait for approval before Edit.
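To illustrate what step 2's mechanical prescan does, here is a reduced sketch covering three of the eight patterns. It is not the real prescan.py, and the hit dict is a simplified stand-in for the schema in references/schemas.md:

```python
import re

# Illustrative subset of the mechanical patterns only.
PATTERNS = {
    7:  re.compile(r"\b(delve|tapestry|pivotal|testament|underscore)\w*", re.I),
    14: re.compile(r"\u2014"),  # em-dash
    23: re.compile(r"\b(in order to|it is important to note that)\b", re.I),
}

def prescan(text):
    """Return a hit list of {line, pattern, match} dicts, one per match."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern_id, rx in PATTERNS.items():
            for m in rx.finditer(line):
                hits.append({"line": lineno, "pattern": pattern_id,
                             "match": m.group(0)})
    return hits
```

The real script emits this kind of hit list as JSON so the rewrite can start from the flagged lines.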

Quick reference — the 10 highest-signal tells

Roughly 90% of real AI slop comes from this subset. The 8 mechanical patterns (#1, #4, #7, #8, #9, #14, #23, #28) are pre-flagged by prescan.py; #3 and #10 stay LLM-only — too context-dependent for regex. Full catalog with before/after examples is in references/patterns.md — consult it when a hit needs context or you are unsure whether to flag.

| # | Pattern | Instead |
| --- | --- | --- |
| 1 | Significance inflation — "pivotal moment", "testament to", "evolving landscape" | State the fact directly. |
| 3 | Superficial -ing — "…reflecting broader trends", "…underscoring the importance" | End the sentence; drop the participial coda. |
| 4 | Promotional — "nestled", "breathtaking", "vibrant", "stunning" | Neutral description with a concrete detail. |
| 7 | AI vocabulary — delve, tapestry, intricate, pivotal, testament, underscore, crucial, garner, showcase, vibrant, interplay, align with, additionally, moreover, furthermore, indeed | Plain-English equivalent or delete. |
| 8 | Copula avoidance — "serves as", "stands as", "features", "boasts" | Use is/are/has. |
| 9 | Negative parallelism — "It's not just X, it's Y" | Direct affirmative sentence. |
| 10 | Rule of three — three-item lists where two or four would be honest | Use the real count. |
| 14 | Em-dash overuse | Prefer commas or periods unless the dash does real work. |
| 23 | Filler phrases — "in order to", "it is important to note that", "at this point in time" | Delete or contract. |
| 28 | Signposting — "Let's dive in", "Here's what you need to know", "Without further ado" | Just say the thing. |

Preservation rules

The rewrite must NOT change:

  • Code — anything inside backticks, fenced code blocks, or <code>…</code>.
  • URLs and anchors — the (url) portion of [text](url), #anchor refs, image paths.
  • Frontmatter — YAML/TOML blocks at file top.
  • Quoted material — text inside "…" attributed to a person or source.
  • Technical terms, proper nouns, product names — even when they match an "AI vocabulary" flag in other contexts (e.g., a product literally named "Tapestry" is not a pattern-7 hit).
  • Structural markers — heading levels, list depth, table columns, HTML tag syntax (tag names and attribute names). Rewrite the prose inside the structure; do not restructure.
  • Factual claims — if a sentence states a number, date, or attribution, preserve it verbatim even when the surrounding clause is rewritten.

May be adjusted — link text inside […] is prose and can be rewritten when it carries AI tells (e.g., [delve into the transformative landscape][read more]). HTML attributes that contain prose (alt, title, aria-label) follow the same principle.

When in doubt, keep the original token and only adjust the connective tissue around it.
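One mechanical way to honor the code-preservation rules is to mask protected regions before the rewrite pass, as utils.py's mask_protected_regions does. A simplified sketch covering only fenced blocks and inline code spans (the real helper also handles URLs, frontmatter, and quoted material):

```python
import re

def mask_protected(text, token="\x00MASK\x00"):
    """Replace fenced blocks and inline code spans with a placeholder.

    Returns (masked_text, saved_regions) so the originals can be
    restored verbatim after the rewrite.
    """
    saved = []

    def stash(m):
        saved.append(m.group(0))
        return token

    text = re.sub(r"```.*?```", stash, text, flags=re.S)  # fenced blocks first
    text = re.sub(r"`[^`\n]+`", stash, text)              # then inline spans
    return text, saved
```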

Output format

For inline text

## Rewrite

<humanized text>

## Patterns removed

- #N <pattern name> — <short note, e.g., "4 instances, em-dashes converted to commas">
- ...

For file paths

## Diff preview

<unified-diff-style or before/after blocks for changed passages>

## Patterns removed

- #N <pattern name> — <count>
- ...

Apply? (yes/no)

Apply only on explicit yes from the user. When another skill invokes /humanize-en on a file, the approval prompt still flows to the end user — a parent skill must not auto-answer on their behalf.

Rules

Everything not listed below is already enforced by Process and Preservation rules above.

  • Never inject first-person voice, opinions, or colloquial hedges into neutral registers (docs, specs, formal READMEs, release notes). The source voice wins; only the AI tells go.
  • Never drop a sentence entirely unless it is pure chatbot artifact (e.g., "I hope this helps!", "Let me know if you'd like me to expand on any section"). Every other sentence gets rewritten, not deleted.
  • One pass only — do not recurse. If the user wants a second round, they ask.
  • Match the source register — a commit message stays terse, a release note stays bulleted, a README paragraph stays prose.

When to defer to another skill

  • Pure spelling or grammar errors → /fix-grammar.
  • Structural problems (wrong headings, missing TOC, collapse patterns) → /write-clear-readme.
  • Define, update, or inspect a brand voice doc → /brand-voice extract|update|diff|show. This skill consumes the voice doc via -f; /brand-voice produces it.
  • The text is in a non-English language → stop and tell the user; this skill is English-only by design.

Reference

  • references/patterns.md — full 32-pattern catalog with before/after examples. Load when a hit needs context or a reviewer asks why a phrase was flagged.
  • references/voice.md — optional voice calibration for opinion pieces or personal writing. Load only when the user explicitly asks for voice, personality, or a sample-matching pass.
  • references/schemas.md — JSON shapes for prescan hit lists, eval samples, and eval results. Consult when editing any script that produces structured output.
  • scripts/prescan.py — regex-based pre-scan emitting a JSON hit-list for the 8 highest-signal mechanical patterns. Python 3.7+, no third-party deps. Called in Process step 2 above.
  • scripts/utils.py — shared I/O helpers (read_text, read_json, write_json, mask_protected_regions, seeded_rng) used by the other scripts. Not invoked directly.
  • scripts/eval_patterns.py — runs prescan over the eval corpus (eval-corpus/samples/*.json), scores per-sample pass/fail, emits a JSON report per references/schemas.md § eval result. Exit 0 on full pass, 1 on any failure. Run before editing prescan patterns to baseline current coverage, then re-run to confirm no regression.