book-mirror
book-mirror — Personalized Chapter-by-Chapter Book Analysis
Convention: see _brain-filing-rules.md for the sanctioned
media/<format>/<slug>exception this skill files under.Convention: see conventions/quality.md for citation rules, back-link enforcement, and output quality bars.
Convention: see conventions/brain-first.md for the lookup chain (brain → search → external) the context-gathering phase follows.
What this does
Given a book (EPUB or PDF), produce a brain page where every chapter is
summarized in detail on the left and mirrored back to the reader's actual life
on the right, using their own words, situations, people, and patterns from
the brain. Output is a brain page at media/books/<slug>-personalized.md.
This is NOT a generic book summary. The right column is the value: it makes the book read like a therapist who knows the reader is leaving notes in the margins. If the user wants a flat summary instead, route them to a different skill.
Trust contract (read this before running)
book-mirror runs as a CLI command (gbrain book-mirror), NOT as a pure
markdown skill that the agent dispatches via tools. The CLI is the trusted
runtime; the skill is the orchestration prose around it.
What this means for the agent:
- The CLI submits N read-only subagent jobs (one per chapter). Each subagent
has
allowed_tools: ['get_page', 'search']only. They CANNOT call put_page or any mutating op. They produce markdown analysis via their final message. - The CLI reads each child's
job.result, assembles the final two-column page, and writes it via a single operator-trustput_page. - This means untrusted EPUB/PDF content cannot prompt-inject any
people/*page. The trust narrowing happens at the tool allowlist, not at the slug-prefix layer.
The pipeline
1. ACQUIRE → User has the EPUB/PDF locally (manual; book-acquisition is
not currently shipped — see "Acquiring the book" below).
2. EXTRACT → Pull chapter text from EPUB/PDF into one .txt per chapter.
3. CONTEXT → Gather everything the brain knows about the reader.
4. ANALYZE → `gbrain book-mirror` fans out N read-only subagents.
5. ASSEMBLE → CLI reads each child result and writes one put_page.
6. PDF → Optional: render via skills/brain-pdf for delivery.
1. Acquiring the book
book-acquisition (legal-grey-area downloader) was deliberately not shipped in this skill wave. The user drops the EPUB/PDF manually. Common paths the user might use:
# User-supplied path
ls path/to/book.epub
ls path/to/book.pdf
# Or already in the brain repo (recommended for tracking)
ls $BRAIN_DIR/media/books/
Resolve $BRAIN_DIR from the gbrain config (gbrain config get sync.repo_path)
or accept it from the user.
2. Text extraction
Goal: one .txt file per chapter under a temp directory. The agent has
shell + python access; the CLI is downstream of this and takes the
extracted directory as input.
EPUB
SLUG="this-book" # kebab-case
WORK="$(mktemp -d)/$SLUG"
mkdir -p "$WORK/chapters"
unzip -o path/to/book.epub -d "$WORK/unpacked"
# Find content files (XHTML/HTML), sorted (chapter order = sort order)
find "$WORK/unpacked" -name "*.xhtml" -o -name "*.html" | sort > "$WORK/files.txt"
# Strip HTML to text per chapter
python3 - <<'PY'
from bs4 import BeautifulSoup
import os, sys
work = os.environ['WORK']
files = open(f'{work}/files.txt').read().splitlines()
for i, path in enumerate(files, 1):
html = open(path, encoding='utf-8', errors='replace').read()
text = BeautifulSoup(html, 'html.parser').get_text('\n')
text = '\n'.join(line.strip() for line in text.splitlines() if line.strip())
with open(f'{work}/chapters/{i:02d}.txt', 'w') as f:
f.write(text)
PY
If bs4 is missing: pip3 install beautifulsoup4 lxml.
Inspect the chapter files to identify which are real chapters vs front
matter (TOC, copyright, acknowledgments). Often the EPUB ships one file
per chapter; sometimes multiple chapters per file. Use
head -5 "$WORK/chapters/"*.txt to spot-check.
pdftotext -layout path/to/book.pdf "$WORK/full.txt"
Then split by chapter heading (look for "Chapter N", "CHAPTER N", or
all-caps title lines) using awk or python. If the PDF is a scan with
no embedded text, fall back to OCR via skills/brain-pdf or another
vision tool.
Quality check
For each chapter file:
- Word count > 1500 (typical chapter range 2k–8k words).
- No HTML tags.
- Paragraphs preserved with
\n\n.
Save a chapters/INDEX.md mapping chapter number → title → file → word
count for reference.
3. Context gathering
This is the most critical step. The right column is only as good as the context fed to each chapter subagent.
What to pull
- Templates: USER.md and SOUL.md if the user maintains them
(gbrain ships templates at
templates/USER.mdandtemplates/SOUL.md; they live in the brain repo when populated). Read full. - Recent daily memory — last 14 days of brain pages under
wiki/personal/reflections/or wherever the user files daily notes. - Topic-relevant brain searches tuned to the book's themes:
gbrain query "marriage",gbrain query "couples therapy"for a marriage book.gbrain query "founders",gbrain query "fundraising"for a business book.gbrain query "shame",gbrain query "anger"for a psychology book.
- Brain pages for relevant entities —
gbrain query "<name>"for people who will likely come up. - Standing patterns — anything in the user's reflections or originals that's been recurring.
Assemble a context pack
Write everything to a single file the CLI can read:
CONTEXT="$WORK/context.md"
{
echo "## USER.md (if any)"
[ -f "$BRAIN_DIR/USER.md" ] && cat "$BRAIN_DIR/USER.md"
echo
echo "## SOUL.md (if any)"
[ -f "$BRAIN_DIR/SOUL.md" ] && cat "$BRAIN_DIR/SOUL.md"
echo
echo "## Recent reflections (last 14 days)"
# Pull recent daily reflections — adapt to the user's filing scheme
# ...
echo
echo "## Topic-relevant brain pages"
# gbrain query the book's key themes, embed top results
# ...
echo
echo "## Themes & cruxes"
# A 1-page summary, written by the agent, calling out:
# - What's currently active in the user's life that this book intersects
# - Specific quotes from the user that map to book themes
# - People and dates that should appear in the right column
} > "$CONTEXT"
Make this dense. It's read by every chapter subagent.
4. Analysis: invoke gbrain book-mirror
gbrain book-mirror \
--chapters-dir "$WORK/chapters" \
--context-file "$CONTEXT" \
--slug "$SLUG" \
--title "Book Title Goes Here" \
--author "Author Name" \
--model claude-opus-4-7
The CLI:
- Validates inputs and loads chapter files.
- Prints a cost estimate (~$0.30/chapter at Opus) and prompts to confirm.
- Submits N child subagent jobs with read-only
allowed_tools. - Waits for every child to complete.
- Reads each child's
job.result(the markdown analysis text). - Assembles all chapters into one page with frontmatter + intro + per-chapter sections + closing.
- Writes ONE
put_pagetomedia/books/<slug>-personalized.md. - Reports a JSON envelope on stdout:
{"slug": "...", "chapters_total": N, "chapters_completed": N, "chapters_failed": 0}.
If any chapter failed, the CLI exits 1 and the user can re-run — idempotency
keys (book-mirror:<slug>:ch-<N>) deduplicate completed chapters at the
queue level, so retry is cheap.
Model: Opus by default
The default model is claude-opus-4-7. Sonnet works (use --model claude-sonnet-4-6) but the right-column quality drops noticeably — the
texture that makes the analysis read like a therapist who knows the user
needs Opus-grade reasoning.
Cost gate
The CLI refuses to spend in a non-TTY context without --yes. CI / scripted
invocations must pass --yes explicitly. TTY users get a [y/N] prompt
before submission.
5. PDF (optional)
After the brain page is written, render to PDF using skills/brain-pdf:
gbrain put_page # already done by the CLI; nothing to add here
# Then invoke brain-pdf:
# (see skills/brain-pdf/SKILL.md for the make-pdf invocation)
6. Fact-check and cross-link
After the page lands, run a fact-check pass on factual claims about the reader (parents, siblings, marriage history, jobs, heritage). Common error patterns to look for:
- Conflating the reader's parents' relationship with patterns in extended family.
- Inventing therapy backstory ("after his parents' divorce…") when the reader's parents are still together.
- Wrong number/age of children, wrong spouse / kid / sibling names.
If you can't verify a claim, remove it. Better to lose texture than to introduce a falsehood.
Cross-link entities mentioned in the analysis:
- For every person the right column references with a brain page, add a
back-link from
people/<slug>to the newmedia/books/<slug>-personalizedpage (perconventions/quality.mdIron Law).
Quality bar (the bar)
The left column should:
- Preserve the author's actual stories, statistics, frameworks, examples.
- Quote memorable phrases verbatim.
- Be detailed enough that the reader could skip the book and not lose much.
The right column should:
- Use the reader's actual quoted words from the context pack.
- Reference specific dates, situations, people by name.
- Read like a therapist who knows the reader is leaving notes in the margins.
- Be plain about direct hits ("This is exactly the [name a real situation]").
- Be honest about misses ("This chapter is less directly relevant because…"). Don't force connections.
The whole document should feel like one coherent voice, calibrated to the reader's actual life rather than a generic profile, and honest about where the book's framing breaks down for this specific reader.
Anti-patterns (do not do these)
- ❌ Skimming chapters. Standing instruction: preserve detail.
- ❌ Generic right column. "This might apply if you've ever felt…" → kill on sight.
- ❌ Factual errors about the reader's life. Always fact-check after assembly.
- ❌ Giving the subagent put_page access. Trust contract is read-only; the CLI does the writing.
- ❌ Forcing connections. If a chapter doesn't apply, say so plainly.
- ❌ Sycophancy or moralizing in the right column. No "you should…", no "consider…", no "perhaps it's time to…".
- ❌ Truncating the LEFT column. The book's actual content needs to survive.
Output checklist
- Book file exists locally (path known).
- Chapter texts under
$WORK/chapters/*.txtwith sane word counts. - Context pack at
$WORK/context.mdis dense. -
gbrain book-mirror --chapters-dir … --context-file … --slug … --title …returned exit 0. -
media/books/<slug>-personalized.mdexists in the brain. - Fact-check pass complete (no errors against USER.md or other source-of-truth pages).
- Cross-links added from referenced people/companies.
- Optional: PDF rendered via brain-pdf and delivered.
Related skills
skills/brain-pdf/SKILL.md— render the personalized page to PDF.skills/strategic-reading/SKILL.md— read a book through a specific problem-lens instead of personalizing to the whole reader.skills/article-enrichment/SKILL.md— same shape applied to articles rather than books.
Contract
This skill guarantees:
- Routing matches the canonical triggers in the frontmatter.
- Output written under the directories listed in
writes_to:(when applicable). - Conventions referenced (
quality.md,brain-first.md,_brain-filing-rules.md) are followed. - Privacy contract preserved: no real names, no fork-specific filesystem path literals, no upstream-fork references.
The full behavior contract is documented in the body sections above; this section exists for the conformance test.
Output Format
The skill's output shape is documented inline in the body sections above (see "Output", "Brain page format", or equivalent). The literal section header here exists for the conformance test (test/skills-conformance.test.ts).
Anti-Patterns
The full anti-pattern list is in the body sections above; this header exists for the conformance test if the body uses a different casing.