PDF Processing Guide

Overview

This guide covers essential PDF processing operations using Python libraries and command-line tools. For advanced features, JavaScript libraries, and detailed examples, see REFERENCE.md. If you need to fill out a PDF form, read FORMS.md and follow its instructions.

CRITICAL: Smart PDF Reading — Avoid Context Overflow

Claude's Read tool converts each PDF page into an image. The API has a hard limit of 100 images per conversation. A 90+ page PDF will fail outright, and even smaller PDFs can consume an enormous share of the context budget, because each page image costs far more tokens than the equivalent plain text.

This is the #1 cause of failures when processing PDFs. Always think before you read.

Step 0: Probe First, Read Later

For any PDF the user uploads or asks you to read, run the probe script first to understand what you're dealing with:
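
The probe script itself is not reproduced in this section. As a stand-in, the following is a minimal sketch of what such a probe could check. It assumes the pypdf library is installed. The names `ProbeResult`, `probe_pdf`, and `recommend_strategy`, and the `max_direct_pages` threshold of 10, are hypothetical choices for illustration, not the guide's actual script. Only the 100-image limit comes from the text above.

```python
from dataclasses import dataclass

IMAGE_LIMIT = 100  # per-conversation image cap stated in this guide


@dataclass
class ProbeResult:
    page_count: int
    has_text_layer: bool


def probe_pdf(path: str, sample_pages: int = 3) -> ProbeResult:
    """Count pages and check whether the first few pages yield extractable text."""
    # Assumed dependency (pip install pypdf); imported lazily so the
    # decision logic below can be used without it.
    from pypdf import PdfReader

    reader = PdfReader(path)
    page_count = len(reader.pages)
    text_chars = 0
    for i in range(min(sample_pages, page_count)):
        # extract_text() can return an empty string for scanned pages.
        text_chars += len((reader.pages[i].extract_text() or "").strip())
    return ProbeResult(page_count=page_count, has_text_layer=text_chars > 0)


def recommend_strategy(
    probe: ProbeResult,
    images_used: int = 0,
    max_direct_pages: int = 10,  # hypothetical threshold, tune to taste
) -> str:
    """Pick a reading strategy from the probe result and the remaining image budget."""
    remaining = IMAGE_LIMIT - images_used
    # Short PDFs that fit in the remaining image budget can be read as page images.
    if probe.page_count <= min(max_direct_pages, remaining):
        return "read_directly"
    # Longer PDFs with a text layer are cheaper to handle as extracted text.
    if probe.has_text_layer:
        return "extract_text"
    # Long scanned PDFs: read only a selected range of pages.
    return "read_page_range"
```

A call such as `recommend_strategy(probe_pdf("report.pdf"))` (with `report.pdf` standing in for the user's file) would return one of the three strategy labels. Keeping the decision separate from the file access makes the thresholds easy to adjust without touching the PDF parsing.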
