Convert scanned PDF to Markdown with OCR
Drop an image-only or scanned PDF and get clean, selectable Markdown. Built-in OCR reads Cyrillic and English, rebuilds real tables and keeps formulas – no account, no separate OCR step.
Yes – a scan becomes selectable Markdown
A scanned PDF is just images of pages, so plain copy-paste returns nothing or garbled characters. PDF to Markdown detects image-only pages and runs OCR (optical character recognition) automatically, turning the pictures of text into real, selectable Markdown – headings, lists, tables and all. It works on documents scanned in Cyrillic, English or a mix of both, and you can convert in the browser without signing up.
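How the converter decides that a page is image-only is not described on this page. As a rough illustration of the idea, the sketch below flags a page for OCR when it carries images but almost no extractable text, or when OCR is forced. The `PageInfo` type, the `needs_ocr` function and the 20-character threshold are assumptions made for this example, not the product's actual logic.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    """Hypothetical per-page summary; not a structure the converter exposes."""
    text: str          # text extracted from the page's embedded text layer
    image_count: int   # number of raster images placed on the page


def needs_ocr(page: PageInfo, force: bool = False, min_chars: int = 20) -> bool:
    """Return True when the page should be read with OCR.

    Assumed heuristic: OCR a page if it is forced, or if it contains at
    least one image and fewer than `min_chars` non-whitespace characters.
    """
    if force:
        return True
    visible_chars = sum(1 for ch in page.text if not ch.isspace())
    return page.image_count > 0 and visible_chars < min_chars


pages = [
    PageInfo(text="", image_count=1),                                # pure scan
    PageInfo(text="Quarterly report for 2023 " * 5, image_count=0),  # digital text
]
print([needs_ocr(p) for p in pages])              # [True, False]
print([needs_ocr(p, force=True) for p in pages])  # [True, True]
```

A heuristic like this misses pages whose text layer exists but is wrong, which is the case the force OCR toggle covers.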
Convert a scanned PDF in 4 steps
No account needed. OCR runs automatically, and you can force it when a PDF has a bad text layer.
Open the converter
Install the Chrome extension or open the web app. Both work anonymously.
Add the scanned PDF
Drag in the file, pick it from disk, or paste a direct PDF URL. OCR runs automatically on image-only pages; toggle force OCR when the existing text layer is wrong.
Wait for the job
The status moves from queued to processing to ready. OCR is heavier than reading digital text, so scans take longer than native PDFs.
Copy or download
Preview the rendered Markdown and the raw source, then copy it to your clipboard or download a .md file.
Tip: automating bulk scans? Skip the UI and call the REST API or hosted MCP – same OCR, driven from your own code or agent.
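For scripted use, the queued, processing, ready sequence from step 3 suggests a simple polling loop. The sketch below does not call any real endpoint: `get_status` is a hypothetical callable that you would back with a request to the REST API, whose routes and response fields are not described on this page. The 5-second interval and the 900-second timeout are assumed defaults, and treating any other status as an error is also an assumption.

```python
import time
from typing import Callable

KNOWN_STATES = ("queued", "processing", "ready")  # states named on this page


def wait_until_ready(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll `get_status` until it returns "ready"; return the number of polls.

    Raises ValueError on a status outside KNOWN_STATES and TimeoutError
    when the job is still not ready after `timeout_seconds`.
    """
    deadline = time.monotonic() + timeout_seconds
    polls = 0
    while True:
        status = get_status()
        polls += 1
        if status == "ready":
            return polls
        if status not in KNOWN_STATES:
            raise ValueError(f"unexpected job status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not become ready in time")
        sleep(poll_seconds)


# Usage with a stand-in status source instead of a real API call.
fake_statuses = iter(["queued", "processing", "processing", "ready"])
polls = wait_until_ready(lambda: next(fake_statuses), sleep=lambda _: None)
print(polls)  # 4
```

Passing `sleep` in as a parameter keeps the loop easy to test without real waiting.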
More than just plain text
Recognizing characters is the easy part. The converter rebuilds the document structure a scan loses, so the Markdown is usable by people and models alike.
Cyrillic & English
Reads Russian and English scans, including mixed-language pages, into selectable text.
Real tables
Scanned columns become genuine Markdown tables instead of a jumble of misaligned lines.
Formulas kept
Mathematical notation is preserved rather than flattened into garbled characters.
Force OCR
Override a bad or partial text layer and re-read the page images when the embedded text is wrong.
Links & footnotes
Where present, hyperlinks and footnotes carry over as Markdown links instead of being dropped.
Engine choice
Convert with MinerU or Docling, depending on the document and the result you want.
Free tier limits & long scans
Free tier limits
The free tier gives 3 slots, 10 MB files, a 15-minute time budget and 1-hour retention. Paid tiers raise every limit and add a longer time budget for heavy scans. Compare plans →
Long or low-quality scans
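A very long or low-quality scan can hit the per-document time budget. The converter then returns what it processed, flagged as truncated, instead of an error. Split the file or use a longer paid budget.
Converting scans at scale?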
The same OCR pipeline is a REST API and a hosted MCP endpoint, with machine-readable discovery so scripts and agents can drive it directly.
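When a long scan keeps coming back truncated, the advice above is to split the file. A minimal sketch of planning that split follows. The `page_ranges` function and the chunk size of 50 pages are illustrative choices, and the actual cutting would be done with whatever PDF tool you already use.

```python
from typing import List, Tuple


def page_ranges(total_pages: int, pages_per_chunk: int) -> List[Tuple[int, int]]:
    """Split pages 1..total_pages into consecutive inclusive (first, last) ranges."""
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [
        (first, min(first + pages_per_chunk - 1, total_pages))
        for first in range(1, total_pages + 1, pages_per_chunk)
    ]


print(page_ranges(120, 50))  # [(1, 50), (51, 100), (101, 120)]
```

Each range can then be exported as its own PDF and converted as a separate job, so no single document has to fit the whole scan into one time budget.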
Common questions
Can it convert a scanned PDF to Markdown?
Yes. Image-only and scanned PDFs are OCR'd automatically into selectable Markdown – no separate OCR step and no setup. Just drop the file in the extension or web app.
Does the OCR support Cyrillic and Russian?
Yes. It reads Cyrillic and English, including mixed-language documents, and turns the recognized text into Markdown.
The PDF has a bad text layer – can I force OCR?
Yes. Turn on force OCR so the converter re-reads the page images instead of trusting the embedded text, which fixes garbled or missing characters.
Are tables and formulas kept when converting a scan?
Yes. Scanned columns are rebuilt as real Markdown tables instead of jumbled lines, and mathematical notation is preserved rather than flattened.
Why is my result marked truncated?
OCR is slow, so a very long scan can hit the per-document time budget. The converter returns what it processed, flagged as a partial (truncated) result. A paid tier has a longer budget, or you can split the file.
Is it free and private?
Yes. The free tier gives 3 slots, 10 MB files, a 15-minute time budget and 1-hour retention – anonymous in the browser, no card. Files are auto-deleted after the retention window and are never used to train models.
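The free-tier numbers in the last answer can be checked in a script before uploading. The sketch below is one way to do that. The `Limits` type and the `check_upload` function are hypothetical, and the code assumes that 10 MB means 10 × 1024 × 1024 bytes and that a slot is one document held at a time, neither of which this page states.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Limits:
    slots: int
    max_file_bytes: int
    time_budget_minutes: int
    retention_hours: int


# Free-tier values as listed above; the byte conversion is an assumption.
FREE_TIER = Limits(slots=3, max_file_bytes=10 * 1024 * 1024,
                   time_budget_minutes=15, retention_hours=1)


def check_upload(size_bytes: int, slots_in_use: int,
                 limits: Limits = FREE_TIER) -> List[str]:
    """Return the reasons an upload would not fit; an empty list means it fits."""
    problems = []
    if size_bytes > limits.max_file_bytes:
        problems.append("file is larger than the size limit")
    if slots_in_use >= limits.slots:
        problems.append("no free slot")
    return problems


print(check_upload(4_000_000, slots_in_use=1))   # []
print(check_upload(25_000_000, slots_in_use=3))  # both problems are reported
```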