PDF Skill
When to use
- Read or review PDF content where layout and visuals matter.
- Create PDFs programmatically with reliable formatting.
- Validate final rendering before delivery.
Workflow
- Prefer visual review: render PDF pages to PNGs and inspect them.
- Use
pdftoppmif available. - If unavailable, install Poppler or ask the user to review the output locally.
- Use
- Use
reportlabto generate PDFs when creating new documents. - Use
pdfplumber(orpypdf) for text extraction and quick checks; do not rely on it for layout fidelity. - After each meaningful update, re-render pages and verify alignment, spacing, and legibility.
Temp and output conventions
- Use
tmp/pdfs/for intermediate files; delete when done. - Write final artifacts under
output/pdf/when working in this repo. - Keep filenames stable and descriptive.
Dependencies (install if missing)
Prefer uv for dependency management.
Python packages:
uv pip install reportlab pdfplumber pypdf
If uv is unavailable:
python3 -m pip install reportlab pdfplumber pypdf
System tools (for rendering):
# macOS (Homebrew)
brew install poppler
# Ubuntu/Debian
sudo apt-get install -y poppler-utils
If installation isn't possible in this environment, tell the user which dependency is missing and how to install it locally.
Environment
No required environment variables.
Rendering command
pdftoppm -png $INPUT_PDF $OUTPUT_PREFIX
Quality expectations
- Maintain polished visual design: consistent typography, spacing, margins, and section hierarchy.
- Avoid rendering issues: clipped text, overlapping elements, broken tables, black squares, or unreadable glyphs.
- Charts, tables, and images must be sharp, aligned, and clearly labeled.
- Use ASCII hyphens only. Avoid U+2011 (non-breaking hyphen) and other Unicode dashes.
- Citations and references must be human-readable; never leave tool tokens or placeholder strings.
Final checks
- Do not deliver until the latest PNG inspection shows zero visual or formatting defects.
- Confirm headers/footers, page numbering, and section transitions look polished.
- Keep intermediate files organized or remove them after final approval.
More from firecrawl/skills
firecrawl-build-scrape
Integrate Firecrawl `/scrape` into product code for single-page extraction. Use when an app already has a URL and needs markdown, HTML, links, screenshots, metadata, or structured page output. Prefer this skill over broader crawl patterns when the feature is page-level.
20.0Kfirecrawl-build-search
Integrate Firecrawl `/search` into product code and agent workflows. Use when an app needs discovery before extraction, when the feature starts with a query instead of a URL, or when the system should search the web and optionally hydrate result content.
20.0Kfirecrawl-build-interact
Integrate Firecrawl `/interact` into product code for dynamic pages and browser actions after scraping. Use when a feature needs clicks, form fills, pagination, authentication-aware flows, or other multi-step interactions that plain `/scrape` cannot complete.
19.9Kfirecrawl-build-onboarding
Get Firecrawl credentials and SDK setup into a project. Use when an application needs `FIRECRAWL_API_KEY`, when an agent should add Firecrawl to `.env`, when the user wants to authenticate Firecrawl for app code, or when choosing the first SDK and docs for a new Firecrawl integration. This skill includes its own browser auth flow, so it does not depend on the website onboarding skill.
19.9Kfirecrawl-build-map
Integrate Firecrawl `/map` into product code for URL discovery on a known site. Use when a feature needs to find pages before scraping or crawling, especially on large docs sites, blogs, or help centers where the exact target URLs are not known yet.
554firecrawl-build-crawl
Integrate Firecrawl `/crawl` into product code for bulk extraction across a site or site section. Use when a feature needs many related pages, such as documentation sets, help centers, or blogs, and page-by-page `/scrape` would be too manual.
554