# Goquery — jQuery-Style HTML Parsing for Go
> A Go package that brings jQuery-like syntax for traversing and manipulating HTML documents. Built on top of the net/html parser and the cascadia CSS selector library.
## Install
```bash
go get github.com/PuerkitoBio/goquery
```
## Quick Use
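```go
// resp is an *http.Response obtained with net/http.
doc, err := goquery.NewDocumentFromReader(resp.Body)
if err != nil {
	log.Fatal(err)
}
doc.Find("h2.title").Each(func(i int, s *goquery.Selection) {
	fmt.Println(s.Text()) // text of each <h2 class="title">
})
```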
## Introduction
Goquery provides a jQuery-like API for parsing and querying HTML in Go. It combines Go's net/html parser with the cascadia CSS selector engine, giving developers a familiar, chainable interface for extracting data from web pages without writing manual tree-walking code.
## What Goquery Does
- Parses HTML documents into a traversable node tree
- Supports most CSS3 selector queries via the cascadia library
- Provides chainable methods like Find, Filter, Children, Parents, and Siblings
- Extracts text content, attribute values, and HTML fragments
- Enables DOM manipulation including Append, Remove, and ReplaceWith
## Architecture Overview
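Goquery wraps Go's html.Node tree in a Selection type that holds a slice of matched nodes plus a pointer to the root Document. Traversal and filtering methods on Selection return new Selection values, enabling jQuery-style chaining. String selectors are compiled by cascadia each time a method such as Find is called; for hot paths, compile the selector once and pass it to the Matcher variants (for example FindMatcher) to keep repeated queries fast.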
## Setup & Configuration
- Requires Go 1.18 or later; newer goquery releases may raise the minimum, so check the `go.mod` of the version you install
- Install with `go get github.com/PuerkitoBio/goquery`
- Create documents from an io.Reader, a string, or an http.Response
- Pair with Go's net/http client for scraping workflows
- Combine with colly or similar crawlers for large-scale extraction
## Key Features
- Broad CSS3 selector support, including pseudo-classes and attribute selectors
- Positional methods: First, Last, Eq, Slice for index-based access
- Traversal methods mirror jQuery: Next, Prev, Parent, Closest
- Attribute helpers: Attr, AttrOr, HasClass, AddClass, RemoveClass
- Zero external C dependencies, pure Go implementation
## Comparison with Similar Tools
- **Colly** — full scraping framework with request scheduling; goquery handles just the parsing layer
- **htmlquery** — uses XPath instead of CSS selectors for node selection
- **x/net/html** — raw tokenizer with no selector engine; goquery builds on top of it
- **Cascadia** — the CSS selector engine goquery uses internally; lower-level API
## FAQ
**Q: Can goquery execute JavaScript?**
A: No. Goquery parses static HTML only. For JavaScript-rendered pages, use a headless browser like chromedp and pass the rendered HTML to goquery.
**Q: Is goquery safe for concurrent use?**
A: A Document is safe to read concurrently. Mutations require external synchronization.
**Q: How does goquery handle malformed HTML?**
A: It relies on Go's net/html parser, which implements the HTML5 parsing algorithm and handles malformed markup gracefully.
**Q: Can I modify the DOM and serialize back to HTML?**
A: Yes. Use manipulation methods to change the tree, then call Selection.Html (inner HTML) or goquery.OuterHtml to get the modified HTML as a string, or goquery.Render to write it to an io.Writer.
## Sources
- https://github.com/PuerkitoBio/goquery
- https://pkg.go.dev/github.com/PuerkitoBio/goquery
---
Source: https://tokrepo.com/en/workflows/asset-9b5f4ad9
Author: AI Open Source