Introduction
Turndown converts HTML into clean, readable Markdown. It traverses the DOM tree (or parses an HTML string) and maps each element to its Markdown equivalent. This is useful when migrating content from rich-text CMS platforms, processing clipboard paste events, or building HTML-to-Markdown pipelines for documentation workflows.
What Turndown Does
- Converts HTML strings or DOM nodes into CommonMark-compliant Markdown
- Handles headings, lists, links, images, code blocks, and blockquotes out of the box; tables require the GFM plugin
- Supports custom rules to override or extend the default conversion behavior
- Preserves content structure while stripping unnecessary markup
- Works in the browser and in Node.js (recent releases bundle the domino DOM implementation for server-side parsing; older releases used jsdom)
Architecture Overview
Turndown walks the HTML DOM tree depth-first. Each node is matched against an ordered set of rules, and the first matching rule wins. Each rule defines a filter (a tag name, an array of tag names, or a function that receives the node) and a replacement function that returns the Markdown string for that node. Built-in rules cover standard HTML elements. Developers can add, remove, or override rules to customize output. The library also collapses whitespace, escapes Markdown special characters, and inserts blank lines between blocks to produce clean, readable output.
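The rule-matching idea can be illustrated with a small sketch. It is not Turndown's source code: it runs over plain objects instead of DOM nodes, skips escaping and whitespace handling, and the names rules, matches, convert, and tree are all made up for the illustration.

```javascript
// Illustrative only: a tiny rule-based converter over plain objects,
// not Turndown's real implementation.

// Ordered rules: the first rule whose filter matches a node wins.
const rules = [
  { filter: 'h1', replacement: (content) => `# ${content}\n\n` },
  { filter: ['strong', 'b'], replacement: (content) => `**${content}**` },
  {
    filter: (node) => node.tag === 'a' && Boolean(node.href),
    replacement: (content, node) => `[${content}](${node.href})`,
  },
  { filter: 'p', replacement: (content) => `${content}\n\n` },
];

// A filter may be a tag name, an array of tag names, or a function.
function matches(filter, node) {
  if (typeof filter === 'string') return filter === node.tag;
  if (Array.isArray(filter)) return filter.includes(node.tag);
  return filter(node);
}

function convert(node) {
  // Text nodes are emitted as-is in this sketch (no escaping).
  if (typeof node === 'string') return node;
  // Depth-first: convert the children before applying the node's own rule.
  const content = node.children.map(convert).join('');
  const rule = rules.find((r) => matches(r.filter, node));
  // Elements without a matching rule fall back to their converted content.
  return rule ? rule.replacement(content, node) : content;
}

// Hand-built stand-in for: <div><h1>Title</h1><p>See <a href="..."><strong>this</strong></a>.</p></div>
const tree = {
  tag: 'div',
  children: [
    { tag: 'h1', children: ['Title'] },
    {
      tag: 'p',
      children: [
        'See ',
        { tag: 'a', href: 'https://example.com', children: [{ tag: 'strong', children: ['this'] }] },
        '.',
      ],
    },
  ],
};

const sketchOutput = convert(tree).trim();
console.log(sketchOutput);
// # Title
//
// See [**this**](https://example.com).
```

Converting children first means a replacement function always receives Markdown for its content, which is why nested elements such as a bold link compose without special cases.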
Setup & Configuration
- Install from npm or load via CDN for browser use
- Create an instance with new TurndownService(options)
- Configure heading style (atx or setext), bullet list marker, code block style, and link style
- Add custom rules with turndownService.addRule(name, { filter, replacement })
- Use the turndown-plugin-gfm plugin for GitHub Flavored Markdown tables and strikethrough
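A minimal configuration sketch follows. The option values are just one possible choice, and the highlight rule, the createService helper, and the mapping of mark elements to ==text== are assumptions made for illustration. The helper takes the TurndownService constructor as an argument so the snippet does not hard-code how the library is loaded (npm, CDN, or a bundler).

```javascript
// Options corresponding to the settings listed above.
const turndownOptions = {
  headingStyle: 'atx',       // or 'setext'
  bulletListMarker: '-',     // or '*' / '+'
  codeBlockStyle: 'fenced',  // or 'indented'
  linkStyle: 'inlined',      // or 'referenced'
};

// A custom rule in the { filter, replacement } shape that addRule accepts.
// The rule itself is a made-up example, not a built-in.
const highlightRule = {
  filter: 'mark',
  replacement: (content) => `==${content}==`,
};

// Hypothetical helper: builds a configured service from the constructor
// and an optional list of plugins (for example gfm from turndown-plugin-gfm).
function createService(TurndownService, plugins = []) {
  const service = new TurndownService(turndownOptions);
  service.addRule('highlight', highlightRule);
  if (plugins.length > 0) {
    service.use(plugins); // use() accepts a single plugin or an array
  }
  return service;
}

// Usage, assuming `npm install turndown turndown-plugin-gfm`:
//   const TurndownService = require('turndown');
//   const { gfm } = require('turndown-plugin-gfm');
//   const service = createService(TurndownService, [gfm]);
//   const markdown = service.turndown('<h2>Notes</h2><p><mark>todo</mark></p>');
```

Keeping the options and the rule as plain objects makes them easy to reuse across several service instances and to unit-test without loading the library.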
Key Features
- Produces clean, human-readable Markdown from complex HTML
- Extensible rule system for custom element handling
- GFM plugin adds table, strikethrough, and task list support
- Escapes Markdown special characters to prevent unintended formatting
- Small bundle size with no heavy dependencies
Comparison with Similar Tools
- rehype-remark — part of the unified ecosystem; more modular but requires assembling a plugin pipeline; Turndown is simpler for one-off conversions
- html-to-markdown (Go) — server-side Go library; Turndown runs in JavaScript on both client and server
- Remark — processes Markdown ASTs; Turndown converts HTML to Markdown, a different direction
- Showdown — converts Markdown to HTML (the opposite direction); pair with Turndown for round-tripping
FAQ
Q: Does Turndown handle tables?
A: The core library converts tables to plain text. Install the turndown-plugin-gfm plugin to get proper GFM-style pipe tables.
Q: Can I ignore certain HTML elements during conversion?
A: Yes. Call turndownService.remove(filter) to drop matching elements together with their content, or add a rule whose filter matches those elements and whose replacement function returns an empty string.
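Both approaches can be sketched as follows. The helper name dropNoise, the rule name dropHidden, and the choice of elements are hypothetical; service stands for an existing TurndownService instance.

```javascript
// Hypothetical helper: `service` is an already constructed TurndownService.
function dropNoise(service) {
  // remove() discards matching elements together with their content.
  service.remove(['script', 'style', 'noscript']);

  // Same effect through a rule, here driven by a function filter that
  // inspects the DOM node (nodeName is upper-case for HTML elements).
  service.addRule('dropHidden', {
    filter: (node) =>
      node.nodeName === 'DIV' && node.getAttribute('aria-hidden') === 'true',
    replacement: () => '',
  });

  return service;
}

// Usage, assuming `npm install turndown`:
//   const TurndownService = require('turndown');
//   const service = dropNoise(new TurndownService());
//   const markdown = service.turndown(html);
```

remove() is the shorter option when a tag name is enough; a rule with a function filter is useful when the decision depends on attributes.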
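Q: Does Turndown work in Node.js?
A: Yes. In Node.js, Turndown parses HTML strings into a DOM tree before converting. Recent releases bundle the domino DOM implementation for this (older releases used jsdom), so the parser is installed along with the package and no separate DOM library is needed.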
Q: How does Turndown handle inline styles and classes?
A: By default, Turndown ignores CSS classes and inline styles, converting only the semantic HTML structure. Custom rules can inspect attributes if needed.
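As a sketch of an attribute-inspecting rule, the example below treats span elements with a bold inline font-weight as strong text, which can occur in pasted rich text. The rule name boldSpan and the regular expression are assumptions for illustration, not part of Turndown.

```javascript
// Hypothetical rule: map <span style="font-weight: bold|600-900"> to **text**.
const boldSpanRule = {
  filter: (node) =>
    node.nodeName === 'SPAN' &&
    /font-weight:\s*(bold|[6-9]00)/i.test(node.getAttribute('style') || ''),
  replacement: (content) => `**${content}**`,
};

// Usage, assuming `npm install turndown`:
//   const TurndownService = require('turndown');
//   const service = new TurndownService();
//   service.addRule('boldSpan', boldSpanRule);
//   const markdown = service.turndown('<p><span style="font-weight: 700">key</span> point</p>');
```

Spans that do not match the filter are left to the default handling, so the rule only changes output for the styled case.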