book-to-skill
Book-to-Skill Converter
Transform written knowledge into actionable Claude Code skills by extracting structure — not producing summaries.
Philosophy
Books contain crystallized expertise: frameworks, principles, and techniques that took years to develop. This skill extracts that knowledge into a format Claude can leverage repeatedly.
Extract structure, not summaries. A skill isn't a book report. It's a toolkit of:
- Named frameworks (mental models with clear application)
- Actionable principles (rules that guide decisions)
- Techniques (step-by-step methods)
- Anti-patterns (what to avoid and why)
- Voice calibration (how the author thinks and communicates)
Preserve the author's precision. Frameworks often have specific names for reasons. "The 5 Whys" isn't interchangeable with "ask why multiple times." Capture the exact formulation.
Layer depth appropriately. Simple books → simple skills. Complex books with 10+ frameworks → skills with reference files and on-demand chapters.
Modes of Operation
Three paths available. Route based on what the user asks:
1. Full Conversion (Default)
Trigger: User provides a PDF or EPUB path without special instructions
Action: Run all steps below (Steps 0–10)
Output: Complete skill with SKILL.md, chapters/, glossary, patterns, cheatsheet
2. Analyze Only
Trigger: User says "analyze", "just extract", or "I want to review before generating"
Action: Run Steps 0–3, then produce a structured extraction report (frameworks, principles, techniques found). Stop — do NOT generate skill files.
Output: Analysis report for user review
3. Generate from Prior Analysis
Trigger: User has existing analysis notes or previously ran analyze-only
Action: Skip Steps 0–3, use the provided analysis as input, run Steps 4–10
Output: Skill files from the provided analysis
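The routing between the three modes can be sketched as a small keyword check. This is a rough illustration only: the function name `route_mode`, the `has_prior_analysis` flag, and the exact trigger strings are assumptions made for the example, and in practice the decision is a judgment about what the user asked for rather than a string match.

```python
# Hypothetical trigger phrases, taken from the "Analyze Only" description above.
ANALYZE_TRIGGERS = ("analyze", "just extract", "review before generating")


def route_mode(user_message: str, has_prior_analysis: bool = False) -> str:
    """Return 'generate', 'analyze', or 'full' for a user request."""
    text = user_message.lower()
    # Prior analysis wins: the user may mention "analyze-only" while asking to generate.
    if has_prior_analysis:
        return "generate"
    if any(trigger in text for trigger in ANALYZE_TRIGGERS):
        return "analyze"
    # Default: a bare path with no special instructions means Full Conversion.
    return "full"
```

Checking for prior analysis first is a design choice in this sketch, so that a message such as "I previously ran analyze-only" is not routed back to analysis.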
Step 0 — Out-of-scope check
If the argument is NOT a path to a PDF or EPUB file, stop and respond:
"book-to-skill requires a PDF or EPUB path. Usage:
/book-to-skill /path/to/book.pdf [skill-name]
or
/book-to-skill /path/to/book.epub [skill-name]"
Step 1 — Validate input
test -f "$0" && echo "FILE_OK" || echo "FILE_NOT_FOUND: $0"
file "$0" | grep -iE "pdf|epub|zip" && echo "FORMAT_OK" || echo "FORMAT_UNKNOWN"
Check the file extension (.pdf or .epub) or magic bytes (%PDF or PK zip header).
If the file is not found or the format is not supported, stop with a clear error message listing supported formats.
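The magic-byte check described above can also be sketched in Python. The function name `detect_format` is hypothetical, and the sketch treats any file starting with `PK` as an EPUB, which is an approximation because every ZIP archive shares that header.

```python
from pathlib import Path


def detect_format(path: str) -> str:
    """Return 'pdf' or 'epub' based on the file's first bytes."""
    file = Path(path)
    if not file.is_file():
        raise FileNotFoundError(f"FILE_NOT_FOUND: {path}")
    with file.open("rb") as handle:
        header = handle.read(4)
    if header.startswith(b"%PDF"):
        return "pdf"
    if header.startswith(b"PK"):  # EPUB is a ZIP container, so it starts with the ZIP header
        return "epub"
    raise ValueError("FORMAT_UNKNOWN: supported formats are PDF and EPUB")
```

Reading only the first four bytes keeps the check cheap even for large books.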
Step 1.5 — Identify book type
Before extracting, ask the user:
"What kind of content does this book have? This helps me choose the best extraction method.
- Technical — has code blocks, tables, formulas, diagrams (e.g. programming books, academic papers, architecture guides)
- Text-heavy — mostly prose, few or no tables/code (e.g. management, productivity, narrative non-fiction)
- Not sure — I'll use the fast method and warn you if quality seems limited"
Store the answer as BOOK_TYPE:
- Option 1 (Technical) → BOOK_TYPE=technical
- Option 2 (Text-heavy) → BOOK_TYPE=text
- Option 3 (Not sure) → BOOK_TYPE=text
If BOOK_TYPE=technical, inform the user before proceeding:
"📐 Technical mode selected — using Docling for structure-aware extraction (tables, code blocks, formulas preserved as markdown). This takes ~1.5s per page, so expect a few minutes for longer books. Starting now…"
If BOOK_TYPE=text, inform:
"📄 Text mode selected — using fast extraction (pdftotext). Ready in seconds."
Step 2 — Extract text from PDF or EPUB
Run the extraction script, passing the book type:
python3 ~/.claude/skills/book-to-skill/scripts/extract.py "$0" --mode <BOOK_TYPE>
- --mode technical → uses Docling (layout-aware, preserves tables and code blocks as markdown)
- --mode text → uses pdftotext → PyPDF2 → pdfminer fallback chain (fast, plain text)
This creates:
- /tmp/book_skill_work/full_text.txt — full extracted text
- /tmp/book_skill_work/metadata.json — title, estimated pages, token count, size, extraction_mode
Read /tmp/book_skill_work/metadata.json to understand what was extracted.
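The text-mode fallback chain can be pictured as "try each backend in order and keep the first non-empty result". The sketch below is not the actual contents of extract.py, which this document does not show; `extract_with_fallback` and the extractor callables are hypothetical stand-ins for wrappers around pdftotext, PyPDF2, and pdfminer.

```python
from typing import Callable, Sequence

# An extractor takes a file path and returns the extracted text.
Extractor = Callable[[str], str]


def extract_with_fallback(
    path: str, extractors: Sequence[tuple[str, Extractor]]
) -> tuple[str, str]:
    """Try each (name, extractor) in order; return (name, text) from the first that yields text."""
    errors = []
    for name, extractor in extractors:
        try:
            text = extractor(path)
        except Exception as exc:  # a missing tool or parse failure moves on to the next backend
            errors.append(f"{name}: {exc}")
            continue
        if text.strip():
            return name, text
        errors.append(f"{name}: empty output")
    raise RuntimeError("all extractors failed: " + "; ".join(errors))
```

Returning the backend name alongside the text makes it easy to record which method produced the output, for example in an `extraction_mode` field.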
Step 2.5 — Pre-flight cost estimate
Read /tmp/book_skill_work/metadata.json and present the user with an estimate before doing any generation:
📖 Book detected: <filename> (<format: PDF or EPUB>)
📄 Pages/Spine items: ~<N> | Words: ~<N> | Source tokens: ~<N>K
💰 Estimated token cost (Full Conversion):
Input (book reading + prompts): ~<N>K tokens
Output (skill files generated): ~<N>K tokens
Total: ~<N>K tokens
Reference prices (as of 2025):
Claude Sonnet 4.5 → ~$<X> USD
Claude Haiku 4.5 → ~$<X> USD
⏱ Estimated time: ~<N> minutes
📁 Files to be generated:
SKILL.md + <N> chapter files + glossary + patterns + cheatsheet
➡ Proceed with Full Conversion? (or type "analyze only" to preview first)
How to estimate:
- Input tokens ≈ estimated_tokens from metadata × 1.3 (prompts overhead per chapter pass)
- Output tokens ≈ chapters × 1,000 + 4,000 (SKILL.md) + 4,500 (glossary + patterns + cheatsheet)
- Price: Sonnet input=$3/MTok output=$15/MTok — Haiku input=$1/MTok output=$5/MTok
Wait for the user to confirm before proceeding. If they say "analyze only", switch to Mode 2.
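The estimation rules above reduce to a few lines of arithmetic. In this sketch the function name `estimate_cost` and its parameter names are illustrative; prices are passed in per million tokens so the reference prices can be updated without touching the formula.

```python
def estimate_cost(
    estimated_tokens: int,
    chapters: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> dict:
    """Apply the estimate rules: 1.3x input overhead, fixed output budgets per file type."""
    input_tokens = round(estimated_tokens * 1.3)
    # 1,000 per chapter file + 4,000 for SKILL.md + 4,500 for glossary, patterns, cheatsheet
    output_tokens = chapters * 1_000 + 4_000 + 4_500
    usd = (
        input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    ) / 1_000_000
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "usd": round(usd, 2),
    }
```

For a book with 200,000 source tokens and 12 chapters at $3 input and $15 output per MTok, this gives 260,000 input tokens, 20,500 output tokens, and roughly $1.09.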
Step 3 — Analyze book structure
Read the first 8,000 characters of /tmp/book_skill_work/full_text.txt to identify:
- Book title and author(s)
- Chapter structure (look for "Chapter N", "PART I", numbered headings, table of contents)
- Core themes and subject domain
- Approximate number of chapters
Then read the Table of Contents section if present to map all chapters.
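A heading scan along these lines can be approximated with a regular expression. The pattern and the function name `find_chapter_headings` are assumptions for illustration; the pattern will also match numbered list items and some sentences beginning with "Part", so its results need a sanity check against the table of contents.

```python
import re

# Matches lines such as "Chapter 3 ...", "PART II", or "4. Title" at the start of a line.
CHAPTER_HEADING = re.compile(
    r"^(?:chapter[ \t]+\d+|part[ \t]+[ivxlc]+\b|\d{1,2}[.)][ \t]+\S)[^\n]*$",
    re.IGNORECASE | re.MULTILINE,
)


def find_chapter_headings(text: str) -> list[tuple[int, str]]:
    """Return (character offset, heading line) for each chapter-like heading."""
    return [(m.start(), m.group(0).strip()) for m in CHAPTER_HEADING.finditer(text)]
```

The character offsets are kept so that later steps can read each chapter's section directly from the full text.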
If mode is "Analyze Only": produce the extraction report now and stop. Structure:
## Extraction Report — <Title>
### Author's Core Frameworks
- **<Framework Name>**: <what it is and when to apply>
### Key Principles
- <Principle>: <actionable rule>
### Techniques & Methods
- <Technique>: <step-by-step or how-to>
### Anti-patterns
- <What to avoid>: <why>
### Suggested Skill Name
`{author-lastname}-{core-concept}` — e.g. `cialdini-influence`
### Chapters Detected
| # | Title | Main Frameworks |
Step 4 — Ask purpose (Full Conversion only)
Before generating, ask the user:
"What should this skill help you do? (Pick one or more)
- Apply the author's frameworks while working
- Think with the author's mental models
- Reference specific chapters and concepts
- All of the above"
Use the answer to weight what gets highlighted in the SKILL.md Core section.
Step 5 — Determine skill name
If $1 was provided, use it as the skill slug.
Otherwise, propose two options and let the user choose:
- By author-concept: {author-lastname}-{core-concept} (e.g. cialdini-influence, meadows-systems)
- By title: lowercase, hyphen-separated words from the book title (e.g. designing-data-intensive-apps)
Default to author-concept format if the book has a strong methodological identity.
Check that ~/.claude/skills/<skill_name>/ does NOT already exist.
If it does, append -2 or ask the user before overwriting.
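The naming and collision rules can be sketched as two helpers. Both function names are hypothetical, and continuing past `-2` to `-3` and beyond is an extension added for the example; the step above only specifies appending `-2` or asking the user.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def available_skill_name(name: str, skills_root: Path) -> str:
    """Return name unchanged, or name-2, name-3, ... if that directory already exists."""
    candidate, suffix = name, 2
    while (skills_root / candidate).exists():
        candidate = f"{name}-{suffix}"
        suffix += 1
    return candidate
```

A typical call would pass `Path.home() / ".claude" / "skills"` as `skills_root`. Note that `slugify` keeps every word of the title, so shortening a long slug is still a manual choice.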
Step 6 — Create skill directory structure
mkdir -p ~/.claude/skills/<skill_name>/chapters
Step 7 — Generate chapter summaries
TOKEN BUDGET RULE — CRITICAL:
- Each chapter summary file: 800–1,200 tokens (dense, not verbose)
- Chapter files are loaded on demand, so this range is a target rather than a hard cap; keep each file useful and tight
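A quick way to check a draft against the budget is a character-based estimate. The sketch assumes roughly 4 characters per token, which is only a rule of thumb for English prose and not a real tokenizer; both function names are illustrative.

```python
def approx_tokens(text: str) -> int:
    """Rough token estimate, assuming about 4 characters per token (heuristic only)."""
    return len(text) // 4


def within_budget(text: str, low: int = 800, high: int = 1200) -> bool:
    """Check a chapter summary against the 800 to 1,200 token target."""
    return low <= approx_tokens(text) <= high
```

The `low` and `high` parameters can be changed to apply the same check to other files with different limits.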
For EACH chapter/major section identified in Step 3:
Read the corresponding section of /tmp/book_skill_work/full_text.txt (use character offsets or grep for chapter headings).
Create ~/.claude/skills/<skill_name>/chapters/ch<NN>-<slug>.md using the structure below.
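Cutting the full text by character offsets and naming the output files might look like the sketch below. It assumes the heading offsets are already known from the structure analysis; `chapter_filename` and `slice_chapters` are hypothetical helper names.

```python
import re


def chapter_filename(number: int, title: str) -> str:
    """Build a ch<NN>-<slug>.md filename with a zero-padded chapter number."""
    slug = "-".join(re.findall(r"[a-z0-9]+", title.lower()))
    return f"ch{number:02d}-{slug}.md"


def slice_chapters(text: str, offsets: list[int]) -> list[str]:
    """Cut text into sections, each running from one heading offset to the next."""
    bounds = sorted(offsets) + [len(text)]
    return [text[start:end] for start, end in zip(bounds, bounds[1:])]
```

Any text before the first offset (front matter, for example) is left out of the slices, which is usually what a chapter-by-chapter pass wants.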
Adapt emphasis based on BOOK_TYPE:
- technical → prioritize "Code Examples", "Reference Tables", and "Commands & APIs" sections; preserve exact syntax
- text → prioritize "Frameworks Introduced", "Mental Models", and "Key Takeaways"; skip empty technical sections
# Chapter N: <Full Title>
## Core Idea
<1–2 sentences: the single most important thing this chapter teaches>
## Frameworks Introduced
- **<Framework Name>**: <exact formulation — preserve the author's naming>
- When to use: <specific situation>
- How: <steps or criteria>
## Key Concepts
- **<Term>**: <precise definition in 1 sentence>
(5–10 most important terms from this chapter)
## Mental Models
<2–4 frameworks or thinking tools. Write as "Use X when Y" or "Think of X as Y">
## Anti-patterns
- **<What to avoid>**: <why it fails>
## Code Examples *(technical books only — omit if BOOK_TYPE=text)*
<!-- Copy the most instructive snippet from the chapter. Preserve indentation exactly. -->
```<language>
<key code example from this chapter>
```
- What it demonstrates: <explanation>
## Reference Tables *(technical books only — omit if BOOK_TYPE=text)*
## Key Takeaways
(3–7 takeaways a practitioner must remember)
## Connects To
- Ch N: <relationship>
- <related topic>: <relationship>
---
## Step 8 — Generate supporting files
### glossary.md
Create `~/.claude/skills/<skill_name>/glossary.md`:
- Every significant term from the book, alphabetically sorted
- Format: `**Term** — definition (Ch N)`
- Max 1,500 tokens
### patterns.md
Create `~/.claude/skills/<skill_name>/patterns.md`:
- All concrete techniques, design patterns, algorithms from the book
- Format: `## Pattern Name\n**When to use**: ...\n**How**: ...\n**Trade-offs**: ...`
- Max 2,000 tokens
### cheatsheet.md
Create `~/.claude/skills/<skill_name>/cheatsheet.md`:
- Decision tables, comparison matrices, quick-reference rules
- The content you'd want on a single printed page
- Max 1,000 tokens
---
## Step 9 — Generate the master SKILL.md
**CRITICAL TOKEN BUDGET: Keep SKILL.md body under 4,000 tokens.**
Compaction truncates from the END — put the most important content FIRST.
Create `~/.claude/skills/<skill_name>/SKILL.md`:
```markdown
---
name: <skill_name>
description: Knowledge base from "<Full Title>" by <Author(s)>. Use when applying <author>'s frameworks for <key topics, 3–6 terms>.
when_to_use: <10–15 trigger phrases based on book topics and terms. Comma-separated.>
allowed-tools: Read Grep
argument-hint: [topic, framework name, or chapter number]
---
# <Full Title>
**Author**: <Author(s)> | **Pages**: ~<N> | **Chapters**: <N> | **Generated**: <YYYY-MM-DD>
## How to Use This Skill
- **Without arguments** — `/skill-name` loads core frameworks for reference
- **With a topic** — `/skill-name replication` → I find and read the relevant chapter
- **With chapter** — `/skill-name ch05` → I load that specific chapter
- **Browse** — ask "what chapters do you have?" to see the full index
When you ask about a topic not covered in Core Frameworks below, I will read
the relevant chapter file before answering.
---
## Core Frameworks & Mental Models
<!-- ~2,000 tokens: the author's most important named frameworks and principles.
Preserve exact names. Write as "Use X when Y", "Prefer X over Y because Z".
This is a toolkit, not a summary. -->
<generate 2,000 tokens of the most critical frameworks and insights here>
---
## Chapter Index
| # | Title | Key Frameworks |
|---|-------|----------------|
| [ch01](chapters/ch01-<slug>.md) | <Title> | <framework1>, <framework2> |
| [ch02](chapters/ch02-<slug>.md) | <Title> | <framework1>, <framework2> |
...
## Topic Index
<!-- Alphabetical. Major terms/frameworks → chapter(s) that cover them. -->
- **<Term>** → ch<N>[, ch<N>]
- **<Term>** → ch<N>
## Supporting Files
- [glossary.md](glossary.md) — all key terms with definitions
- [patterns.md](patterns.md) — all techniques and design patterns
- [cheatsheet.md](cheatsheet.md) — quick reference tables and decision guides
---
## Scope & Limits
This skill covers the book content only. For hands-on implementation in your codebase,
combine with project-specific tools. For topics beyond this book, check related skills
or ask Claude directly.
```
Step 10 — Cleanup and report
rm -rf /tmp/book_skill_work
Then report to the user:
✅ Skill created: ~/.claude/skills/<skill_name>/
📚 Book: <Full Title> — <Author>
📄 Pages: ~<N> | Chapters: <N>
Files generated:
SKILL.md — core frameworks + index (~X tokens)
chapters/ — <N> chapter summaries (~X tokens each, ~X total)
glossary.md — key terms (~X tokens)
patterns.md — techniques & patterns (~X tokens)
cheatsheet.md — quick reference (~X tokens)
─────────────────────────────────────────────────────
Total skill size: ~X tokens (loaded on-demand, not all at once)
💡 Tip: run /cost in Claude Code to see the actual token usage for this session.
Usage:
/<skill_name> → load core frameworks
/<skill_name> <topic> → find and explain a topic
/<skill_name> ch<N> → dive into a specific chapter
Quality Rules
- Extract structure, not summaries — capture named frameworks, exact formulations, anti-patterns; not chapter recaps
- Preserve the author's precision — "The 5 Whys" ≠ "ask why multiple times"; keep exact naming
- Density over completeness — a 1,000-token summary beats a 10,000-token excerpt
- Practitioner voice — write "Use X when Y", not "The book explains X"
- Front-load SKILL.md — compaction keeps the first 5,000 tokens; most important content comes first
- Chapter files are on-demand — they don't count against skill budget until loaded
- Never copy raw book text — always synthesize, summarize, extract signal
- Topic index is critical — it's how Claude navigates to the right chapter file