MinerU Document Extractor

MinerU Document Extraction with mineru-open-api

MinerU is a document extraction tool. The mineru-open-api CLI converts PDFs, images, Office documents, and web pages to Markdown, and with a token to HTML, LaTeX, DOCX, and JSON as well.

Installation

npm install -g mineru-open-api

Or via Go (macOS/Linux):

go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Verify: mineru-open-api version

Two MinerU extraction modes

| Feature | MinerU flash-extract | MinerU extract |
| --- | --- | --- |
| Token required | No | Yes (mineru-open-api auth) |
| Speed | Fast | Normal |
| Table recognition | Yes | Yes |
| Formula recognition | Yes | Yes |
| OCR | Yes | Yes |
| Output formats | Markdown only | md, html, latex, docx, json |
| Batch mode | No | Yes |
| Model selection | pipeline | vlm, pipeline, MinerU-HTML |
| File size limit | 10 MB | Much higher |
| Page limit | 20 pages | Much higher |

Core MinerU workflow

  1. Start fast with MinerU (no token): run mineru-open-api flash-extract <file> for quick Markdown conversion
  2. Need more from MinerU? Create a token at https://mineru.net/apiManage/token, run mineru-open-api auth, then use mineru-open-api extract for multi-format output, the VLM model, and batch processing
  3. Web pages with MinerU: run mineru-open-api crawl <url> to convert web content
  4. Check results: output goes to stdout by default, or to the path given with -o
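The first two steps amount to a choice between subcommands. A minimal sketch of that choice follows, assuming a local file and checking only the 10 MB size limit; `choose_mode` and `FLASH_MAX_BYTES` are hypothetical names, not part of the CLI, and the 20-page limit is not checked because that would need a separate PDF tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick a subcommand from the size limit stated in this document.
# Assumption: "10 MB" means 10 * 1024 * 1024 bytes. The page limit is not checked.
FLASH_MAX_BYTES=$((10 * 1024 * 1024))

choose_mode() {
  local file="$1"
  local size
  # wc -c prints the byte count; tr strips the padding some platforms add
  size=$(wc -c < "$file" | tr -d '[:space:]')
  if [ "$size" -le "$FLASH_MAX_BYTES" ]; then
    echo "flash-extract"
  else
    echo "extract"
  fi
}

# Possible usage (not run here):
#   mineru-open-api "$(choose_mode report.pdf)" report.pdf
```

If the helper returns extract, a token must already be configured, as described under Authentication.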

Authentication

A token is required only for MinerU extract and crawl. MinerU flash-extract works without one.

mineru-open-api auth                    # Interactive token setup
export MINERU_TOKEN="your-token"        # Or set via environment variable

Token resolution order: --token flag > MINERU_TOKEN env > ~/.mineru/config.yaml.
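The resolution order can be mirrored in a small pre-flight check. This is a sketch only: `token_source` is a hypothetical helper, it tests whether the config file exists without parsing it, and it may not match how the CLI resolves tokens internally.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which source a token would come from, following
# the documented order: --token flag > MINERU_TOKEN env > ~/.mineru/config.yaml.
# The config file is only checked for existence; its contents are not read.
token_source() {
  local flag_token="${1:-}"
  if [ -n "$flag_token" ]; then
    echo "flag"
  elif [ -n "${MINERU_TOKEN:-}" ]; then
    echo "env"
  elif [ -f "$HOME/.mineru/config.yaml" ]; then
    echo "config"
  else
    echo "none"
  fi
}
```

For the authoritative answer, mineru-open-api auth --show is documented to show the token source.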

Supported input formats

MinerU accepts a wide range of document formats:

| Format | MinerU flash-extract | MinerU extract |
| --- | --- | --- |
| PDF (.pdf) | Yes | Yes |
| Images (.png, .jpg, .jpeg, .jp2, .webp, .gif, .bmp) | Yes | Yes |
| Word (.docx) | Yes | Yes |
| Word (.doc) | No | Yes |
| PowerPoint (.pptx) | Yes | Yes |
| PowerPoint (.ppt) | No | Yes |
| HTML (.html) | No | Yes |
| URLs (remote files) | Yes | Yes |

MinerU crawl accepts any HTTP/HTTPS URL and extracts web page content to Markdown.

MinerU flash-extract — Quick extraction (no token needed)

Fast, token-free MinerU document extraction. Outputs Markdown only. Limited to 10 MB / 20 pages per file.

mineru-open-api flash-extract report.pdf                     # MinerU Markdown to stdout
mineru-open-api flash-extract report.pdf -o ./out/           # Save to file
mineru-open-api flash-extract https://example.com/doc.pdf    # URL mode
mineru-open-api flash-extract report.pdf --language en       # Specify language
mineru-open-api flash-extract report.pdf --pages 1-10        # Page range

Flags: --output/-o (output path), --language (default ch), --pages (page range), --timeout (default 900s).

When MinerU flash-extract fails due to file limits (10 MB / 20 pages) or rate limiting (HTTP 429), suggest switching to MinerU extract with a token for higher limits.
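One way to act on that advice is a wrapper that retries with extract when flash-extract fails. The sketch below assumes the CLI exits non-zero on failure and that a token is already configured; it does not distinguish limit errors from rate limiting. `extract_with_fallback` and the `MINERU_BIN` variable are hypothetical and exist only in this example.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: try flash-extract first, then extract on any failure.
# Assumption: the CLI signals failure with a non-zero exit status.
MINERU_BIN="${MINERU_BIN:-mineru-open-api}"   # overridable for this sketch only

extract_with_fallback() {
  local file="$1" outdir="$2"
  if "$MINERU_BIN" flash-extract "$file" -o "$outdir"; then
    echo "flash-extract"          # report which mode produced the output
  elif "$MINERU_BIN" extract "$file" -o "$outdir"; then
    echo "extract"
  else
    echo "both flash-extract and extract failed" >&2
    return 1
  fi
}
```

An agent that should only suggest the switch, rather than perform it, could replace the second branch with a message to the user.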

MinerU extract — Precision extraction (token required)

Convert documents to Markdown or other formats with MinerU's full capabilities: VLM-based layout analysis, multiple output formats, and batch mode.

mineru-open-api extract report.pdf                         # MinerU Markdown to stdout
mineru-open-api extract report.pdf -f html                 # MinerU HTML output
mineru-open-api extract report.pdf -o ./out/ -f md,docx    # Multiple formats
mineru-open-api extract *.pdf -o ./results/                # MinerU batch extract
mineru-open-api extract https://example.com/doc.pdf        # Extract from URL

Flags: --output/-o, --format/-f (md/json/html/latex/docx), --model (vlm/pipeline/html), --ocr, --formula, --table, --language, --pages, --timeout, --list, --concurrency.

MinerU model comparison: vlm vs pipeline

| | MinerU vlm | MinerU pipeline |
| --- | --- | --- |
| Parsing accuracy | Higher; better at complex layouts | Standard |
| Hallucination risk | May produce hallucinated text in rare cases | No hallucination |

Use --model vlm for documents with complex layouts. Use --model pipeline when the output must not contain hallucinated text.

MinerU crawl — Web page extraction (token required)

mineru-open-api crawl https://example.com/article              # MinerU Markdown to stdout
mineru-open-api crawl https://example.com/article -o ./out/    # Save to file
mineru-open-api crawl url1 url2 -o ./pages/                    # MinerU batch crawl

Flags: --output/-o, --format/-f (md/json/html), --timeout, --list, --concurrency.

MinerU auth — Authentication management

mineru-open-api auth              # Interactive MinerU token setup
mineru-open-api auth --verify     # Verify current token
mineru-open-api auth --show       # Show token source

Output behavior

Without -o, the MinerU result is written to stdout and progress messages to stderr. With -o, the result is saved to the given file or directory. Batch mode and binary formats (docx) require -o.
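The -o requirement can be checked before invoking the CLI. A minimal sketch, assuming that more than one input means batch mode and that docx is the only binary format; `requires_output_flag` is a hypothetical helper, not a CLI feature.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: returns 0 when the invocation needs -o.
# Usage: requires_output_flag "<comma-separated formats>" <input>...
requires_output_flag() {
  local formats="$1"
  shift
  # Assumption: more than one input is treated as batch mode
  if [ "$#" -gt 1 ]; then
    return 0
  fi
  # Wrap in commas so "docx" matches only as a whole list entry
  case ",$formats," in
    *,docx,*) return 0 ;;
  esac
  return 1
}
```

For example, `requires_output_flag md,docx report.pdf` succeeds, which signals that the command should be run with an -o path.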

Agent rules for using MinerU

  • Quote file paths with spaces: mineru-open-api extract "report 01.pdf"
  • Default to MinerU flash-extract when: no token configured, simple extraction, file under 10 MB / 20 pages
  • Use MinerU extract when: user needs non-Markdown formats, VLM model, batch processing, or file exceeds flash-extract limits
  • When user does NOT specify -o, generate output directory: ~/MinerU-Skill/<name>_<hash>/ where <hash> = first 6 chars of MD5 of the source path
  • After MinerU flash-extract success, append a brief hint about MinerU extract upgrade path (once per session)
  • To upgrade MinerU, re-install the CLI binary first: npm install -g mineru-open-api
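The output-directory rule in the list above can be sketched as a shell function. Two details are assumptions, since the rule does not spell them out: <name> is taken as the file name without its last extension, and the MD5 is computed over the source path exactly as given rather than an absolute path. `mineru_out_dir` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build ~/MinerU-Skill/<name>_<hash> for a source path.
# Assumptions: <name> = file name minus its last extension;
# <hash> = first 6 hex chars of the MD5 of the path string as given.
mineru_out_dir() {
  local src="$1"
  local base name hash
  base=$(basename "$src")
  name="${base%.*}"
  if command -v md5sum >/dev/null 2>&1; then
    hash=$(printf '%s' "$src" | md5sum | cut -c1-6)   # GNU coreutils
  else
    hash=$(printf '%s' "$src" | md5 | cut -c1-6)      # macOS/BSD md5
  fi
  echo "$HOME/MinerU-Skill/${name}_${hash}"
}

# Possible usage (not run here):
#   mineru-open-api flash-extract report.pdf -o "$(mineru_out_dir report.pdf)/"
```

Using printf rather than echo keeps a trailing newline out of the hashed string, so the same path always maps to the same directory.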

For full CLI reference and troubleshooting, see: https://github.com/opendatalab/MinerU-Ecosystem/tree/main/cli

Supported --language values

The --language flag accepts the following values (default: ch). Used by both MinerU flash-extract and extract.

Standalone language packs

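| Value | Included languages | Notes |
| --- | --- | --- |
| ch | Chinese, English, Chinese Traditional | Chinese and English (default) |
| ch_server | Chinese, English, Chinese Traditional, Japanese | Traditional Chinese, handwriting |
| en | English | English only |
| japan | Chinese, English, Chinese Traditional, Japanese | Primarily Japanese |
| korean | Korean, English | Korean |
| chinese_cht | Chinese, English, Chinese Traditional, Japanese | Primarily Traditional Chinese |
| ta | Tamil, English | Tamil |
| te | Telugu, English | Telugu |
| ka | Kannada | Kannada |
| el | Greek, English | Greek |
| th | Thai, English | Thai |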
Language family packs

| Value | Script/Family | Included languages |
| --- | --- | --- |
| latin | Latin script | French, German, Afrikaans, Italian, Spanish, Bosnian, Portuguese, Czech, Welsh, Danish, Estonian, Irish, Croatian, Uzbek, Hungarian, Serbian (Latin), Indonesian, Occitan, Icelandic, Lithuanian, Maori, Malay, Dutch, Norwegian, Polish, Slovak, Slovenian, Albanian, Swedish, Swahili, Tagalog, Turkish, Latin, Azerbaijani, Kurdish, Latvian, Maltese, Pali, Romanian, Vietnamese, Finnish, Basque, Galician, Luxembourgish, Romansh, Catalan, Quechua |
| arabic | Arabic script | Arabic, Persian, Uyghur, Urdu, Pashto, Kurdish, Sindhi, Balochi, English |
| cyrillic | Cyrillic script | Russian, Belarusian, Ukrainian, Serbian (Cyrillic), Bulgarian, Mongolian, Abkhazian, Adyghe, Kabardian, Avar, Dargin, Ingush, Chechen, Lak, Lezgin, Tabasaran, Kazakh, Kyrgyz, Tajik, Macedonian, Tatar, Chuvash, Bashkir, Malian, Moldovan, Udmurt, Komi, Ossetian, Buryat, Kalmyk, Tuvan, Sakha, Karakalpak, English |
| east_slavic | East Slavic | Russian, Belarusian, Ukrainian, English |
| devanagari | Devanagari script | Hindi, Marathi, Nepali, Bihari, Maithili, Angika, Bhojpuri, Magahi, Santali, Newari, Konkani, Sanskrit, Haryanvi, English |