Cheerio — Fast HTML Parsing with jQuery Syntax for Node.js

Introduction

Cheerio provides a fast, lean implementation of jQuery's core API for the server. It parses HTML and XML documents into a traversable DOM-like structure, letting you select elements with CSS selectors, read attributes, and manipulate the markup without running a browser or headless engine.

What Cheerio Does

Parses HTML and XML strings into a traversable tree structure
Selects elements using CSS selectors compatible with jQuery syntax
Reads and modifies attributes, text content, and inner HTML
Traverses the DOM with parent, children, siblings, find, and filter
Serializes the modified tree back to an HTML string

Architecture Overview

Cheerio builds an in-memory DOM tree from raw HTML using parse5, its default spec-compliant parser in 1.x, or optionally htmlparser2, which is faster and more lenient. The jQuery-style API wraps this tree with selector-based querying powered by css-select and with DOM manipulation methods. Because there is no browser context, no CSS rendering or JavaScript execution occurs, which keeps Cheerio lightweight and predictable for scraping and template transformation tasks.

Installation & Configuration

Install via npm; works in Node.js 18+ and modern edge runtimes
Call cheerio.load() with an HTML string to create a root query function
Pass options to switch between the default parse5 parser (spec-compliant) and htmlparser2 (fast, lenient)
Configure XML mode for parsing XML documents with self-closing tags
Pair with fetch or axios to download pages before parsing

Key Features

Familiar jQuery API reduces learning curve for front-end developers
Fast parsing without the overhead of a full browser engine
Works with malformed HTML: parse5 applies the HTML specification's error recovery, and htmlparser2 is lenient by design
Supports both HTML and XML document processing
Lightweight with no native dependencies or browser requirement

Comparison with Similar Tools

jsdom — full W3C DOM with script execution but heavier; Cheerio is faster when you only need parsing and selection
Puppeteer — controls a real Chromium browser for JS-rendered pages; Cheerio works on static HTML only
htmlparser2 — lower-level streaming parser; Cheerio adds the jQuery traversal and manipulation layer
BeautifulSoup (Python) — similar concept for Python; Cheerio serves the Node.js ecosystem
LinkedOM — faster DOM alternative; Cheerio offers a more familiar jQuery-style API

FAQ

Q: Can Cheerio execute JavaScript in the page? A: No. Cheerio only parses and manipulates static HTML. For JS-rendered pages use Puppeteer or Playwright.

Q: Is Cheerio suitable for web scraping? A: Yes. Fetch the HTML with an HTTP client and pass it to cheerio.load(). Select data with CSS selectors and extract text or attributes.

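Q: Does Cheerio support streaming HTML parsing? A: Partly. In Node.js, Cheerio 1.0+ provides stringStream() and decodeStream(), which return writable streams you can pipe HTML into; the callback receives the loaded document once the stream ends. The separate cheerio.fromURL() helper fetches a URL and returns a promise for the loaded document. In both cases the full tree is still built in memory.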

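Q: How does performance compare to jsdom? A: Cheerio is typically faster than jsdom for parsing and querying because it skips browser emulation (no script execution, layout, or full W3C DOM). The exact margin depends on document size and workload, so benchmark with your own pages.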
