# Cheerio — Fast HTML Parsing with jQuery Syntax for Node.js

> A fast, flexible implementation of jQuery core for server-side HTML parsing, traversal, and manipulation in Node.js.

## Quick Use
```bash
npm install cheerio
```
```js
import * as cheerio from 'cheerio';
const $ = cheerio.load('<h2 class="title">Hello</h2>');
$('h2.title').text(); // "Hello"
$('h2').addClass('welcome');
```

## Introduction
Cheerio provides a fast, lean implementation of jQuery's core API for the server. It parses HTML and XML documents into a traversable DOM-like structure, letting you select elements with CSS selectors, read attributes, and manipulate the markup without running a browser or headless engine.

## What Cheerio Does
- Parses HTML and XML strings into a traversable tree structure
- Selects elements using CSS selectors compatible with jQuery syntax
- Reads and modifies attributes, text content, and inner HTML
- Traverses the DOM with parent, children, siblings, find, and filter
- Serializes the modified tree back to an HTML string

## Architecture Overview
Cheerio builds an in-memory DOM tree from raw markup using one of two parsers: parse5, the default for HTML, which follows the HTML specification, or htmlparser2, which is faster, more lenient, and used for XML. The jQuery-style API wraps this tree, with selector queries powered by css-select alongside DOM manipulation methods. Because there is no browser context, no CSS is rendered and no JavaScript is executed, which keeps Cheerio lightweight and predictable for scraping and template transformation tasks.

## Installation & Configuration
- Install via npm; works in Node.js 18+ and modern edge runtimes
- Call cheerio.load() with an HTML string to create a root query function
- Pass options to choose the parser: parse5 (spec-compliant, the default for HTML) or htmlparser2 (fast, lenient, used when the xml option is set)
- Configure XML mode for parsing XML documents with self-closing tags
- Pair with fetch or axios to download pages before parsing

## Key Features
- Familiar jQuery API reduces learning curve for front-end developers
- Fast parsing without the overhead of a full browser engine
- Works with malformed HTML: parse5 applies the HTML specification's error recovery, and htmlparser2 parses leniently
- Supports both HTML and XML document processing
- Lightweight with no native dependencies or browser requirement

## Comparison with Similar Tools
- **jsdom** — full W3C DOM with script execution but heavier; Cheerio is faster when you only need parsing and selection
- **Puppeteer** — controls a real Chromium browser for JS-rendered pages; Cheerio works on static HTML only
- **htmlparser2** — lower-level streaming parser; Cheerio adds the jQuery traversal and manipulation layer
- **BeautifulSoup (Python)** — similar concept for Python; Cheerio serves the Node.js ecosystem
- **LinkedOM** — lightweight DOM implementation that exposes standard DOM APIs; Cheerio offers a jQuery-style API instead

## FAQ
**Q: Can Cheerio execute JavaScript in the page?**
A: No. Cheerio only parses and manipulates static HTML. For JS-rendered pages use Puppeteer or Playwright.

**Q: Is Cheerio suitable for web scraping?**
A: Yes. Fetch the HTML with an HTTP client and pass it to cheerio.load(). Select data with CSS selectors and extract text or attributes.

**Q: Does Cheerio support streaming HTML parsing?**
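A: Yes, in Node.js. Cheerio 1.0 added the stringStream() and decodeStream() helpers, which return writable streams you can pipe HTML into. The separate fromURL() helper fetches a URL and loads the response for you.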

**Q: How does performance compare to jsdom?**
A: Cheerio is typically several times faster than jsdom for parsing and querying because it skips browser emulation.

## Sources
- https://github.com/cheeriojs/cheerio
- https://cheerio.js.org

---
Source: https://tokrepo.com/en/workflows/1c0917a8-3ffb-11f1-9bc6-00163e2b0d79
Author: Script Depot