book-sft-pipeline

Book SFT Pipeline

A complete system for converting books into supervised fine-tuning (SFT) datasets and training style-transfer models. This skill teaches the pipeline from raw ePub to a model that writes in a target author's voice.

When to Activate

Activate this skill when:

  • Building fine-tuning datasets from literary works
  • Creating author-voice or style-transfer models
  • Preparing training data for Tinker or similar SFT platforms
  • Designing text segmentation pipelines for long-form content
  • Training small models (8B or less) on limited data

Core Concepts

The Three Pillars of Book SFT

1. Intelligent Segmentation

Text chunks must be semantically coherent. Breaking mid-sentence teaches the model to produce fragmented output. Target: 150-400 words per chunk, always at natural boundaries.
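The segmentation rule above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the names `segment_text` and `word_count` are hypothetical, the sentence splitter is a naive regex (it will mis-split abbreviations such as "Mr."), and it assumes paragraphs are separated by blank lines. It accumulates whole sentences, never exceeds the upper bound by splitting inside a sentence, and prefers to close a chunk at a paragraph end once the lower bound is reached.

```python
import re

# Naive sentence boundary: whitespace that follows ., ! or ?
# (an assumption for illustration; real prose needs a sturdier splitter).
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def segment_text(text, min_words=150, max_words=400):
    """Split text into chunks that only break at sentence or paragraph ends."""
    chunks = []
    current = []        # sentences in the chunk being built
    current_words = 0   # running word total for that chunk

    def flush():
        nonlocal current, current_words
        if current:
            chunks.append(" ".join(current))
        current, current_words = [], 0

    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for paragraph in paragraphs:
        for raw in SENTENCE_END.split(paragraph):
            sentence = " ".join(raw.split())  # collapse internal line breaks
            if not sentence:
                continue
            n = word_count(sentence)
            # Close the chunk before it would exceed the upper bound.
            if current and current_words + n > max_words:
                flush()
            current.append(sentence)
            current_words += n
        # A paragraph end is the preferred boundary once the chunk is big enough.
        if current_words >= min_words:
            flush()
    flush()  # the final chunk may fall below min_words
    return chunks
```

Two limitations are worth noting in this sketch: a single sentence longer than `max_words` becomes its own oversized chunk rather than being cut, and the last chunk of a book may be shorter than `min_words`. Both choices favor coherent boundaries over strict size limits.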

First Seen
Apr 14, 2026