# Pandoc: Universal Document Format Converter

> Pandoc is a universal document converter that reads and writes dozens of markup formats. It converts between Markdown, LaTeX, HTML, DOCX, EPUB, and many more with a single command, and it can also produce PDF.

## Quick Use

```bash
# Install with a package manager
brew install pandoc            # macOS (Homebrew)
# sudo apt install pandoc      # Debian/Ubuntu alternative

# Markdown to PDF (needs a PDF engine such as LaTeX installed)
pandoc input.md -o output.pdf
# Word document to Markdown
pandoc input.docx -t markdown -o output.md
# Markdown to PowerPoint slides
pandoc README.md -o slides.pptx
```

## Introduction

Pandoc is a command-line tool written in Haskell that converts documents between a wide range of markup and publishing formats. It reads and writes Markdown, reStructuredText, LaTeX, HTML, DOCX, EPUB, and many others, and it writes PDF through an external engine. That range makes it a staple for technical writers, academics, and documentation pipelines.

## What Pandoc Does

- Converts between 40+ input and output formats with a single binary
- Parses extended Markdown with footnotes, tables, citations, and math
- Generates PDF output via LaTeX, Groff, Typst, or wkhtmltopdf
- Handles citation processing with built-in CSL support
- Supports custom templates, filters, and Lua scripting for transformations

## Architecture Overview

Pandoc works in three stages. A reader parses the source document into an internal abstract syntax tree (AST) that represents its logical structure. Filters, written in Lua or in any other language via JSON pipes, can then transform the AST. Finally, a writer serializes the AST to the target format. Because parsing and output generation are decoupled, adding a new format requires only a new reader or writer.
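
The read, transform, write pipeline described above can be sketched with a JSON filter. The following is a minimal Python sketch, not Pandoc's own code. It assumes the JSON form of the AST, in which each element is an object with a `t` (type) field and an optional `c` (content) field, and it rewrites every `Emph` node as `Strong`. The names `walk`, `emph_to_strong`, `transform`, and `main` are illustrative.

```python
import json
import sys


def walk(node, action):
    """Rebuild the tree, applying `action` to every element (a dict with a "t" key)."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        rebuilt = {key: walk(value, action) for key, value in node.items()}
        return action(rebuilt) if "t" in rebuilt else rebuilt
    return node  # strings, numbers, booleans, None


def emph_to_strong(element):
    """Turn emphasis into strong emphasis; leave every other element unchanged."""
    if element["t"] == "Emph":
        return {"t": "Strong", "c": element["c"]}
    return element


def transform(document):
    """Apply the rewrite to a whole document (the dict parsed from pandoc's JSON)."""
    return walk(document, emph_to_strong)


def main():
    """Filter entry point: read the JSON AST from stdin, write the result to stdout."""
    json.dump(transform(json.load(sys.stdin)), sys.stdout)
```

To try it as a filter, one would add a call to `main()` at the end of the script, make it executable, and pass it with `--filter`, for example `pandoc input.md --filter ./emph_to_strong.py -o output.html` (the script name is hypothetical). For a transformation this simple, a Lua filter would avoid the JSON round trip.
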
## Self-Hosting & Configuration

- Install via package managers (apt, brew, choco) or download binaries from GitHub
- Use `--defaults` YAML files to store commonly used conversion options
- Set up custom LaTeX templates for consistent PDF styling across a team
- Integrate into CI pipelines to auto-generate documentation from Markdown
- Combine with pandoc-crossref for numbered figures, tables, and equations

## Key Features

- Broad format coverage spanning plain text, office documents, and e-books
- Citation and bibliography support using BibTeX, BibLaTeX, or CSL JSON
- Lua filter API for document transformations without external tools
- Template system for controlling the output structure of every format
- Self-contained HTML output that embeds images and CSS in a single file (`--embed-resources --standalone` in current releases)

## Comparison with Similar Tools

- **MarkItDown**: converts files to Markdown only; Pandoc handles dozens of output formats
- **Docutils**: focused on reStructuredText; Pandoc supports many more input formats
- **LibreOffice CLI**: strong with office formats but limited for markup languages
- **Asciidoctor**: a tool for the AsciiDoc ecosystem; Pandoc covers more format pairs
- **Typst**: a modern typesetting tool; Pandoc can output to Typst as one of many targets

## FAQ

**Q: Can Pandoc produce high-quality PDFs?**
A: Yes. It generates PDFs via LaTeX by default, giving you full typographic control. You can select Typst or wkhtmltopdf instead with `--pdf-engine`.

**Q: Does Pandoc handle Microsoft Word files?**
A: Yes. It reads and writes DOCX natively, including styles, images, and tables.

**Q: How do I add citations?**
A: Pass `--citeproc` together with a bibliography file (BibTeX, CSL JSON, or YAML) via `--bibliography`, and use cite keys such as `[@key]` in your Markdown.

**Q: Is Pandoc fast enough for large documents?**
A: Pandoc handles books and theses well. Each conversion is an independent process, so very large batch jobs can be parallelized by running one Pandoc process per file.
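
The batch parallelization mentioned in the last answer could look roughly like the sketch below. It is one possible approach, assuming `pandoc` is on the `PATH` and that each output file should sit next to its source. The helper names `build_command` and `convert_all`, the default `html` target, and the worker count of 4 are illustrative choices, not Pandoc features.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(source, target_ext="html", extra_args=()):
    """Return the pandoc argument list that converts `source` to a sibling file."""
    source = Path(source)
    target = source.with_suffix("." + target_ext.lstrip("."))
    return ["pandoc", str(source), "-o", str(target), *extra_args]


def convert_all(sources, target_ext="html", extra_args=(), workers=4):
    """Run one pandoc process per file, `workers` at a time; return the exit codes."""
    commands = [build_command(src, target_ext, extra_args) for src in sources]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Threads are enough here: the real work happens in the child processes.
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

A call such as `convert_all(Path("docs").glob("*.md"), "docx")` would then convert every Markdown file in a hypothetical `docs` directory. A nonzero entry in the returned list marks a file whose conversion failed.
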
## Sources

- https://github.com/jgm/pandoc
- https://pandoc.org/MANUAL.html

---

Source: https://tokrepo.com/en/workflows/303b73e0-44f8-11f1-9bc6-00163e2b0d79
Author: Script Depot