pdf

Installation
SKILL.md

PDF Skill — Action Routing

CRITICAL: Decide Your Action FIRST

Before doing anything, classify the user's request and follow the MANDATORY action:

User wants to... MANDATORY action
Convert .md to .pdf / Generate PDF from markdown / 把 md 转成 pdf MUST use bash to run md_to_pdf.py script (see below)
Create a new PDF from scratch Use bash to run a Python script with reportlab
Read/extract text from a PDF Use read_file or bash with pdfplumber
Merge/split/rotate/encrypt PDFs Use bash with pypdf
Extract tables from a PDF Use bash with pdfplumber
Fill a PDF form Read FORMS.md first
OCR / read text from scanned PDF Convert to images → python_repl with vision LLM (see OCR below)

Markdown → PDF Conversion (MOST COMMON)

ALWAYS use the bash tool to run the built-in script. NEVER use read_file for this task.

bash: python <absolute_path_to_skill>/scripts/md_to_pdf.py <input.md> <output.pdf>

The script path will be listed under "Available Scripts" in the prompt. Use that absolute path directly.

Example:

bash: python /path/to/skills/pdf/scripts/md_to_pdf.py /workspace/report.md /workspace/report.pdf

Features: CJK support, headings, lists, code blocks, tables, bold/italic, blockquotes, horizontal rules.

If reportlab is not installed: bash: pip install reportlab


Reading/Extracting from Existing PDFs

Extract text

import pdfplumber
with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        print(page.extract_text())

Extract tables

with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        for table in page.extract_tables():
            for row in table:
                print(row)

Merge / Split / Rotate

Merge

from pypdf import PdfWriter, PdfReader
writer = PdfWriter()
for f in ["a.pdf", "b.pdf"]:
    for page in PdfReader(f).pages:
        writer.add_page(page)
with open("merged.pdf", "wb") as out:
    writer.write(out)

Split

reader = PdfReader("input.pdf")
for i, page in enumerate(reader.pages):
    w = PdfWriter()
    w.add_page(page)
    with open(f"page_{i+1}.pdf", "wb") as out:
        w.write(out)

Rotate

reader = PdfReader("input.pdf")
writer = PdfWriter()
page = reader.pages[0]
page.rotate(90)
writer.add_page(page)
with open("rotated.pdf", "wb") as out:
    writer.write(out)

Password Protection

from pypdf import PdfReader, PdfWriter
reader = PdfReader("input.pdf")
writer = PdfWriter()
for page in reader.pages:
    writer.add_page(page)
writer.encrypt("userpassword", "ownerpassword")
with open("encrypted.pdf", "wb") as out:
    writer.write(out)

Create PDF from Scratch (reportlab)

from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet

doc = SimpleDocTemplate("output.pdf", pagesize=letter)
styles = getSampleStyleSheet()
story = [Paragraph("Title", styles['Title']), Spacer(1, 12), Paragraph("Body text", styles['Normal'])]
doc.build(story)

IMPORTANT: Never use Unicode subscript/superscript characters in ReportLab. Use <sub> and <super> tags instead.

OCR Scanned PDFs (via Vision LLM)

When pdfplumber.extract_text() returns empty or garbled text, the PDF is likely scanned/image-based. Use python_repl with a vision-capable LLM to analyze the converted images:

Step 1: Convert all PDF pages to images:

bash: python <skill_path>/scripts/convert_pdf_to_images.py <input.pdf> <output_dir>

Step 2: Use python_repl to call the vision LLM for OCR on the images:

import base64, os, json

image_dir = "<output_dir>"
images = sorted([f for f in os.listdir(image_dir) if f.lower().endswith(('.png','.jpg','.jpeg'))])

results = []
for img_file in images:
    path = os.path.join(image_dir, img_file)
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    # Call your vision-capable LLM here (e.g. via API)
    # Return the extracted text for each image
    results.append({"file": img_file, "text": extracted_text})

print(json.dumps(results, ensure_ascii=False, indent=2))

Images are batched into minimal LLM calls automatically (up to 8 per call).


Quick Reference

Task Tool Method
Markdown → PDF bash + md_to_pdf.py python scripts/md_to_pdf.py in.md out.pdf
Extract text pdfplumber page.extract_text()
Extract tables pdfplumber page.extract_tables()
Merge/split/rotate pypdf PdfReader + PdfWriter
Create from scratch reportlab SimpleDocTemplate
Fill forms see FORMS.md
OCR scanned python_repl + vision LLM Convert pages to images → call vision LLM via python_repl

References

  • FORMS.md — PDF form filling
  • REFERENCE.md — Advanced features, JS libraries, troubleshooting
Related skills

More from memento-teams/memento-skills

Installs
8
GitHub Stars
1.3K
First Seen
Mar 26, 2026