# Marker: Convert PDF to Markdown with High Accuracy

> Fast, accurate PDF to Markdown + JSON converter. Handles tables, images, equations, code blocks, and multi-column layouts. GPU-accelerated. 33K+ GitHub stars.

## Install

Install the `marker-pdf` package from PyPI with pip, as shown in the first command under Quick Use.

## Quick Use

```bash
pip install marker-pdf

# Convert a single PDF; output is written under output/
marker_single input.pdf --output_dir output/ --output_format markdown
```

Or use in Python:

```python
from marker.converters.pdf import PdfConverter
from marker.models import create_model_dict

# Load the layout/OCR models once and hand them to the converter
converter = PdfConverter(artifact_dict=create_model_dict())
result = converter("report.pdf")
print(result.markdown)
```

---

## Intro

Marker converts PDF files to Markdown and JSON with high accuracy and speed. It handles complex layouts, including tables, images, equations, code blocks, multi-column text, headers and footers, and footnotes. It is GPU-accelerated for fast batch processing and builds on the Surya OCR engine for multi-language support.

**Best for**: RAG pipelines, document ingestion, PDF data extraction, knowledge base building

**Works with**: Any LLM pipeline, including LangChain, LlamaIndex, Haystack, and custom RAG systems

---

## Key Features

### Accurate Conversion

- **Tables**: Preserved as Markdown tables with alignment
- **Images**: Extracted and saved as separate files
- **Equations**: Converted to LaTeX notation
- **Code blocks**: Detected and emitted as fenced code blocks
- **Multi-column**: Multi-column layouts are read in the correct order
- **Headers/footers**: Automatically removed

### Performance

- **GPU-accelerated**: 10x faster with CUDA than on CPU
- **Batch processing**: Convert entire directories
- **Multi-language**: 90+ languages via the Surya OCR engine

### Output Formats

- Markdown (clean, LLM-ready)
- JSON (structured with metadata)
- HTML

### Comparison

| Feature | Marker | PyPDF | pdfplumber |
|---------|--------|-------|------------|
| Tables | ✅ | ❌ | ✅ |
| Images | ✅ | ❌ | ❌ |
| Equations | ✅ | ❌ | ❌ |
| Multi-column | ✅ | ❌ | ❌ |
| OCR (scanned) | ✅ | ❌ | ❌ |
| Speed | Fast (with GPU) | Fast | Medium |

---

## FAQ

**Q: What is Marker?**

A: A fast, accurate PDF to Markdown converter that handles tables, images, equations, code blocks, and multi-column layouts. It is GPU-accelerated and supports 90+ languages.

**Q: Can Marker handle scanned PDFs?**

A: Yes. It includes OCR via the Surya engine, which supports 90+ languages, so it works on both native and scanned PDFs.

---

## Source & Thanks

> Created by [Datalab](https://github.com/datalab-to). Licensed under GPL-3.0.
>
> [datalab-to/marker](https://github.com/datalab-to/marker), 33,000+ GitHub stars

---

Source: https://tokrepo.com/en/workflows/42976daf-a56a-4152-9afb-d5b00d130a08

Author: Script Depot
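
The batch-processing feature listed under Performance could be sketched in Python roughly as follows. This is an illustrative sketch, not Marker's own batch implementation: `convert_directory` and `make_marker_convert` are hypothetical helper names, and the Marker calls assume the marker-pdf 1.x API, where `PdfConverter` takes an `artifact_dict` built by `create_model_dict()`. The converter is passed in as a plain callable so the directory loop can be exercised without loading any models.

```python
from pathlib import Path
from typing import Callable


def make_marker_convert() -> Callable[[Path], str]:
    """Build a PDF -> Markdown callable backed by Marker (imported lazily)."""
    # Assumption: marker-pdf 1.x is installed and exposes these two imports.
    from marker.converters.pdf import PdfConverter
    from marker.models import create_model_dict

    # Load the models once and reuse the same converter for every file.
    converter = PdfConverter(artifact_dict=create_model_dict())
    return lambda pdf_path: converter(str(pdf_path)).markdown


def convert_directory(
    src: Path, dst: Path, convert: Callable[[Path], str]
) -> list[Path]:
    """Convert every *.pdf directly under src, writing <stem>.md files into dst."""
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(src.glob("*.pdf")):
        try:
            markdown = convert(pdf_path)
        except Exception as exc:
            # One unreadable PDF should not abort the whole batch.
            print(f"skipped {pdf_path.name}: {exc}")
            continue
        out_path = dst / f"{pdf_path.stem}.md"
        out_path.write_text(markdown, encoding="utf-8")
        written.append(out_path)
    return written
```

With Marker installed, a call such as `convert_directory(Path("pdfs"), Path("output"), make_marker_convert())` would process a folder (both folder names are placeholders). The Marker package also provides its own `marker` command for converting a whole input folder, which is likely the better choice when no custom post-processing is needed.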