Skills · May 1, 2026 · 1 min read

Pandoc — Universal Document Format Converter

Pandoc is a universal document converter that reads and writes dozens of markup formats. It converts between Markdown, LaTeX, HTML, DOCX, EPUB, and many more, and can also produce PDF, all with a single command.

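Agent ready

Installable directly by agents

This asset is installable: the agent first selects the current runtime, reviews the install plan, and then runs the matching command.

Native · 98/100 · Policy: allowed
Agent entry: any MCP/CLI agent
Type: Skill
Install: Single
Trust level: Established
Entry: Pandoc Overview
Direct install command:
npx -y tokrepo@latest install 303b73e0-44f8-11f1-9bc6-00163e2b0d79 --target codex

Do a dry run to confirm the install plan before running this command.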

Introduction

Pandoc is a command-line tool written in Haskell that converts documents between a wide range of markup and publishing formats. It handles Markdown, reStructuredText, LaTeX, HTML, DOCX, EPUB, PDF, and many others, making it indispensable for technical writers, academics, and documentation pipelines.

What Pandoc Does

  • Converts between 40+ input and output formats with a single binary
  • Parses extended Markdown with footnotes, tables, citations, and math
  • Generates PDF output via LaTeX, Groff, Typst, or wkhtmltopdf
  • Handles citation processing with built-in CSL support
  • Supports custom templates, filters, and Lua scripting for transformations
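
As a minimal sketch of the single-command conversion described above, the helper below assembles a `pandoc` argument list in Python. The function name `pandoc_args` and the file names are illustrative assumptions, not part of Pandoc; the flags `-o`, `-f`, `-t`, and `--pdf-engine` are standard Pandoc options. The helper only builds the command and leaves execution (for example with `subprocess.run`) to the caller.

```python
from typing import List, Optional


def pandoc_args(
    source: str,
    output: str,
    from_format: Optional[str] = None,
    to_format: Optional[str] = None,
    pdf_engine: Optional[str] = None,
) -> List[str]:
    """Build a pandoc command line (hypothetical helper, not part of Pandoc)."""
    # With no -f/-t given, pandoc infers the formats from the file extensions.
    args = ["pandoc", source, "-o", output]
    if from_format:
        args += ["-f", from_format]
    if to_format:
        args += ["-t", to_format]
    if pdf_engine:
        args.append(f"--pdf-engine={pdf_engine}")
    return args


# Markdown to DOCX, formats inferred from the extensions
docx_cmd = pandoc_args("report.md", "report.docx")
# Markdown to PDF, using Typst instead of the default LaTeX engine
pdf_cmd = pandoc_args("report.md", "report.pdf", pdf_engine="typst")
```

Passing the list to `subprocess.run(docx_cmd, check=True)` would perform the conversion, assuming `pandoc` is on the `PATH`.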

Architecture Overview

Pandoc reads source documents into an internal abstract syntax tree (AST) that represents the logical structure. Writers then serialize the AST to the target format. Filters (written in Lua or any language via JSON pipes) can transform the AST between reading and writing. This design decouples input parsing from output generation, so adding a new format requires only a new reader or writer.
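
The reader, filter, writer flow can be illustrated with a small JSON filter. The sketch below is one possible approach, not Pandoc's own code: it walks the JSON form of the AST and turns every emphasized span into a strong one. The function names and the hand-written sample document are assumptions made for illustration, and the `pandoc-api-version` value varies between Pandoc releases.

```python
import json
from typing import Any


def emph_to_strong(node: Any) -> Any:
    """Recursively copy a pandoc JSON AST, turning every Emph into Strong."""
    if isinstance(node, list):
        return [emph_to_strong(child) for child in node]
    if isinstance(node, dict):
        walked = {key: emph_to_strong(value) for key, value in node.items()}
        if walked.get("t") == "Emph":
            walked["t"] = "Strong"
        return walked
    return node  # strings, numbers, booleans, None


def run_filter(stream_in, stream_out) -> None:
    """Read a JSON AST from stream_in, transform it, write JSON to stream_out."""
    document = json.load(stream_in)
    json.dump(emph_to_strong(document), stream_out)


# Hand-written AST for the Markdown text "Hello *world*" (illustrative only)
sample = {
    "pandoc-api-version": [1, 23, 1],
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "Hello"},
            {"t": "Space"},
            {"t": "Emph", "c": [{"t": "Str", "c": "world"}]},
        ]}
    ],
}
result = emph_to_strong(sample)
```

To use this as a real filter, a script would end with `run_filter(sys.stdin, sys.stdout)` and be passed to Pandoc with `--filter`, which pipes the AST through the script between the reader and the writer.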

Self-Hosting & Configuration

  • Install via package managers (apt, brew, choco) or download binaries from GitHub
  • Use --defaults YAML files to store commonly used conversion options
  • Set up custom LaTeX templates for consistent PDF styling across a team
  • Integrate into CI pipelines to auto-generate documentation from Markdown
  • Combine with pandoc-crossref for numbered figures, tables, and equations
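
A defaults file can be sketched as follows. The Python snippet writes a small YAML file whose keys mirror the corresponding command-line options, then builds a command that uses it. The file name `team-defaults.yaml`, the chosen option values, and both helper functions are assumptions for illustration; a real team would pick its own settings.

```python
import tempfile
from pathlib import Path
from typing import List

# Contents of a hypothetical defaults file. Each key corresponds to a pandoc
# command-line option (--from, --standalone, --toc, --pdf-engine).
DEFAULTS_YAML = """\
from: markdown
standalone: true
toc: true
pdf-engine: xelatex
"""


def write_defaults(directory: Path) -> Path:
    """Write the defaults file into `directory` and return its path."""
    path = directory / "team-defaults.yaml"
    path.write_text(DEFAULTS_YAML, encoding="utf-8")
    return path


def command_with_defaults(defaults: Path, source: str, output: str) -> List[str]:
    """Build a pandoc command that loads shared options from a defaults file."""
    return ["pandoc", "--defaults", str(defaults), source, "-o", output]


# Demonstration in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    defaults_path = write_defaults(Path(tmp))
    written = defaults_path.read_text(encoding="utf-8")
    cmd = command_with_defaults(defaults_path, "guide.md", "guide.pdf")
```

In a CI job, the defaults file would typically be committed to the repository so that local builds and pipeline builds share the same options.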

Key Features

  • Broad format coverage spanning plain text, office documents, and e-books
  • Citation and bibliography support using BibTeX, BibLaTeX, or CSL JSON
  • Lua filter API for powerful document transformations without external tools
  • Template system for controlling the output structure of every format
  • Self-contained HTML output that embeds images and CSS in a single file

Comparison with Similar Tools

  • MarkItDown — converts files to Markdown only; Pandoc handles dozens of output formats
  • Docutils — reStructuredText focused; Pandoc supports many more input formats
  • LibreOffice CLI — strong with office formats but limited for markup languages
  • Asciidoctor — AsciiDoc ecosystem tool; Pandoc covers more format pairs
  • Typst — a modern typesetting tool; Pandoc can output to Typst as one of many targets

FAQ

Q: Can Pandoc produce high-quality PDFs? A: Yes. It generates PDFs via LaTeX by default, giving you full typographic control. You can also use Typst or wkhtmltopdf as PDF engines.

Q: Does Pandoc handle Microsoft Word files? A: Yes. It reads and writes DOCX natively, including styles, images, and tables.

Q: How do I add citations? A: Use --citeproc with a bibliography file (BibTeX, CSL JSON, or YAML) and cite keys in your Markdown.
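
As a rough illustration of the citation workflow, the sketch below shows a Markdown sentence with a cite key and a helper that builds the matching command. The cite key `knuth1984`, the file names, and the helper `citeproc_args` are made up for the example; `--citeproc`, `--bibliography`, and `--csl` are standard Pandoc options.

```python
from typing import List, Optional

# Markdown using pandoc's citation syntax; "knuth1984" is a hypothetical key
# that would have to exist in the bibliography file.
MARKDOWN = "Literate programming was introduced in [@knuth1984, p. 97].\n"


def citeproc_args(
    source: str,
    output: str,
    bibliography: str,
    csl: Optional[str] = None,
) -> List[str]:
    """Build a pandoc command that resolves citations (hypothetical helper)."""
    args = ["pandoc", source, "--citeproc", "--bibliography", bibliography]
    if csl:
        # An optional CSL style file controls how citations are formatted
        args += ["--csl", csl]
    args += ["-o", output]
    return args


plain_cmd = citeproc_args("paper.md", "paper.html", "refs.bib")
styled_cmd = citeproc_args("paper.md", "paper.html", "refs.bib", csl="ieee.csl")
```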

Q: Is Pandoc fast enough for large documents? A: Pandoc handles book-length and thesis-length documents well. For very large batch jobs, each file is converted by an independent pandoc process, so parallelizing across files is straightforward.
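
One way to parallelize across files is sketched below, using a thread pool that launches one `pandoc` process per source file. The function names, the worker count, and the choice of HTML output are assumptions for illustration. The runner is injectable, so the demonstration at the end uses a stand-in that records commands instead of invoking Pandoc.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def run_pandoc(args: List[str]) -> int:
    """Run one pandoc command and return its exit code."""
    return subprocess.run(args, check=False).returncode


def convert_one(source: Path, run: Callable[[List[str]], int]) -> int:
    """Convert a single file to HTML next to the source."""
    target = source.with_suffix(".html")
    return run(["pandoc", str(source), "-o", str(target)])


def convert_all(
    sources: Iterable[Path],
    run: Callable[[List[str]], int] = run_pandoc,
    workers: int = 4,
) -> Dict[str, int]:
    """Convert files concurrently; returns a map of source path to exit code."""
    source_list = list(sources)
    # Threads suffice here: the heavy work happens in the child processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = pool.map(lambda src: convert_one(src, run), source_list)
        return {str(src): code for src, code in zip(source_list, codes)}


# Demonstration with a stand-in runner, so pandoc itself is not required
recorded: List[List[str]] = []


def fake_run(args: List[str]) -> int:
    recorded.append(args)
    return 0


results = convert_all([Path("ch1.md"), Path("ch2.md")], run=fake_run, workers=2)
```

Calling `convert_all(Path("docs").glob("*.md"))` with the default runner would start the real conversions; a nonzero exit code in the result marks a file that failed.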
