OCR Document - Extract Text from Scanned Documents and Images

Extract text from scanned documents and images using OCR via the MinerU Open API. No API key required.

Quick Start

# OCR a scanned PDF
mineru-open-api flash-extract scanned.pdf

# OCR an image of a document
mineru-open-api flash-extract page-photo.jpg

# OCR from a URL (no download needed)
mineru-open-api flash-extract https://example.com/scanned.pdf

# Specify language for better accuracy
mineru-open-api flash-extract scanned.pdf --language en

# Save OCR result to file
mineru-open-api flash-extract scanned.pdf -o ./output/
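
The single-file commands above can be wrapped in a loop for a folder of scans. The following is a minimal sketch, not part of the CLI: ocr_all_pdfs and the MINERU_CMD override are hypothetical names introduced here, and the sketch assumes that --language and -o can be combined in one call and that the CLI exits with a non-zero status on failure.

```shell
# Hypothetical helper: OCR every PDF in a directory, one CLI call per file.
# MINERU_CMD is an illustrative override so the loop can be dry-run.
MINERU_CMD=${MINERU_CMD:-mineru-open-api}

ocr_all_pdfs() {
  src_dir=$1
  out_dir=$2
  lang=${3:-en}            # language hint passed to --language
  failed=0
  mkdir -p "$out_dir"
  for f in "$src_dir"/*.pdf; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    # Assumption: a non-zero exit status signals a failed extraction.
    if ! "$MINERU_CMD" flash-extract "$f" --language "$lang" -o "$out_dir"; then
      echo "failed: $f" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]         # succeed only if every file succeeded
}
```

For example, ocr_all_pdfs ./scans ./output en would process each PDF in ./scans in turn. Files are handled one at a time so that a single failure is reported without stopping the rest.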

Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

Capabilities

OCR for scanned PDFs, photographed documents, and images
Supports PDF, PNG, JPG, WebP, BMP, TIFF
Accepts local files and URLs directly
Language hint via --language (default: ch; use en for English)
No API key, no signup, no authentication
Maximum 10 MB and 20 pages per document
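
The format and size limits listed above can be checked locally before anything is uploaded. This is a rough sketch under stated assumptions: the function names are hypothetical, "10MB" is read as 10 * 1024 * 1024 bytes, and the .jpeg and .tif spellings are assumed to be accepted alongside .jpg and .tiff. The 20-page limit is not checked here.

```shell
# Pre-flight checks based on the documented limits (helper names are hypothetical).
MAX_BYTES=$((10 * 1024 * 1024))   # assumption: "10MB" means 10 MiB

# Succeeds if the file extension is one of the supported formats.
is_supported_format() {
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    # jpeg and tif are assumed aliases of the listed JPG and TIFF formats
    pdf|png|jpg|jpeg|webp|bmp|tif|tiff) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the file is no larger than MAX_BYTES.
is_within_size_limit() {
  size=$(wc -c < "$1" | tr -d ' ')   # strip padding some wc versions add
  [ "$size" -le "$MAX_BYTES" ]
}

# Succeeds only if both checks pass.
can_flash_extract() {
  is_supported_format "$1" && is_within_size_limit "$1"
}
```

A possible use is can_flash_extract scanned.pdf && mineru-open-api flash-extract scanned.pdf, which skips the upload when the file would be rejected anyway.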

When to Use

User asks to "OCR" a document or image
User has a scanned PDF that needs text extraction
User shares a photo of a page and wants the text
User mentions "scan", "handwriting", or "recognize text"

CLI Reference

Run mineru-open-api flash-extract --help for all available options.

Data Privacy

flash-extract uploads the document to MinerU's cloud API for processing and returns the result. No account or API key is required.
Documents are processed in real time and are not stored after extraction.
For details, see https://mineru.net

Notes

Best results with clear, high-resolution scans
For higher-precision OCR with full layout preservation, use mineru-open-api extract --ocr (requires authentication via mineru-open-api auth)
If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli
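
Before running any of the commands in this document, it can help to confirm that the CLI is actually on PATH. The sketch below uses only the POSIX command -v builtin; have_cli and require_mineru_cli are hypothetical helper names, not part of mineru-open-api.

```shell
# Succeeds if the named command is on PATH (defaults to mineru-open-api).
have_cli() {
  command -v "${1:-mineru-open-api}" >/dev/null 2>&1
}

# Prints a pointer to the manual download page when the command is missing.
require_mineru_cli() {
  if ! have_cli "${1:-mineru-open-api}"; then
    echo "mineru-open-api not found; see https://mineru.net/ecosystem?tab=cli" >&2
    return 1
  fi
}
```

Calling require_mineru_cli at the top of a script gives a clear message instead of a bare "command not found" error later on.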