# Puppeteer — Headless Chrome Automation Library by Google

> Control Chrome and Firefox programmatically with a high-level Node.js API for testing, scraping, and screenshot generation.

## Install

Save the content below to `.claude/skills/` or append to your `CLAUDE.md`:

## Quick Use
```bash
npm install puppeteer
node -e "const p=require('puppeteer');(async()=>{const b=await p.launch();const pg=await b.newPage();await pg.goto('https://example.com');await pg.screenshot({path:'shot.png'});await b.close()})()"
```
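
The one-liner above is easier to adapt when unrolled into a function. The sketch below does the same thing with a `try`/`finally` so the browser is closed even if navigation fails. The function name `captureScreenshot` and the `waitUntil` and `fullPage` settings are illustrative choices, not part of the original command, and the `puppeteer` module is passed in as an argument so the caller decides how it is loaded.

```javascript
// Unrolled version of the Quick Use one-liner.
// `captureScreenshot` is a hypothetical helper name used for illustration.
async function captureScreenshot(puppeteer, url, outputPath) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until the network is mostly idle before capturing (an assumption;
    // the one-liner uses the default wait condition).
    await page.goto(url, { waitUntil: 'networkidle2' });
    await page.screenshot({ path: outputPath, fullPage: true });
    return outputPath;
  } finally {
    // Always release the browser process, even when a step above throws.
    await browser.close();
  }
}
```

With the library installed, this could be called as `captureScreenshot(require('puppeteer'), 'https://example.com', 'shot.png')`.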

## Introduction
Puppeteer is a Node.js library maintained by the Chrome DevTools team that provides a high-level API to control Chrome or Firefox over the Chrome DevTools Protocol (CDP) or WebDriver BiDi. It runs headless by default, but it can also launch a visible browser window (headed mode) for debugging or visual testing.
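
As a rough illustration of switching between headless and headed mode, the sketch below builds the options object passed to `puppeteer.launch()`. `headless` and `slowMo` are real launch options; the helper name `debugLaunchOptions` and the 100 ms delay are assumptions made for this example.

```javascript
// Hypothetical helper: choose launch options depending on whether we are debugging.
function debugLaunchOptions(debug) {
  return {
    // Headed mode shows a real browser window; headless is the default.
    headless: !debug,
    // Slow each Puppeteer operation down so actions are visible while debugging.
    slowMo: debug ? 100 : 0,
  };
}
```

A debugging session might then start with `puppeteer.launch(debugLaunchOptions(true))`.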

## What Puppeteer Does
- Automates browser interactions including navigation, form filling, and clicking
- Generates screenshots and PDFs of web pages
- Crawls single-page applications and generates pre-rendered content
- Runs end-to-end tests in a real browser environment
- Intercepts and modifies network requests for testing and scraping
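
Two of the tasks listed above, form filling with a click and PDF generation, can be sketched as small helpers that take an existing `Page`. The selectors `input[name="q"]` and `button[type="submit"]` are placeholders for whatever the target page uses, and the helper names are hypothetical.

```javascript
// Type a query into a search box and submit the form.
// The selectors are placeholders; adjust them to the page under automation.
async function submitSearch(page, query) {
  await page.type('input[name="q"]', query);
  // Start waiting for the navigation before clicking, so the navigation
  // triggered by the click is not missed.
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'domcontentloaded' }),
    page.click('button[type="submit"]'),
  ]);
  return page.title();
}

// Save the current page as an A4 PDF, including background graphics.
async function savePdf(page, outputPath) {
  await page.pdf({ path: outputPath, format: 'A4', printBackground: true });
  return outputPath;
}
```

Both helpers assume the page has already been opened with `browser.newPage()` and navigated with `page.goto()`.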

## Architecture Overview
Puppeteer communicates with a browser instance via the Chrome DevTools Protocol (CDP) or WebDriver BiDi. When you call `puppeteer.launch()`, it spawns a browser process and, by default, connects to it over a WebSocket. Each tab is represented as a `Page` object, and every interaction is sent as a protocol command. Installing the `puppeteer` package downloads a compatible Chrome for Testing build by default, though you can point it at an existing Chrome or Firefox installation.
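
To make the launch-then-connect flow concrete, the sketch below launches a browser, reads its WebSocket endpoint, opens a tab, and attaches a second client to the same process with `puppeteer.connect()`. The function name `inspectConnection` and the shape of the returned object are assumptions for illustration.

```javascript
// Hypothetical helper that exposes the pieces described above:
// the browser process, its WebSocket endpoint, and its Page objects.
async function inspectConnection(puppeteer) {
  const browser = await puppeteer.launch();
  try {
    // The WebSocket URL that protocol commands travel over.
    const endpoint = browser.wsEndpoint();
    await browser.newPage();
    const pages = await browser.pages();

    // A second client can attach to the same browser through that endpoint.
    const client = await puppeteer.connect({ browserWSEndpoint: endpoint });
    const pagesSeenByClient = (await client.pages()).length;
    // disconnect() detaches this client without closing the browser.
    await client.disconnect();

    return { endpoint, pageCount: pages.length, pagesSeenByClient };
  } finally {
    await browser.close();
  }
}
```

Calling `inspectConnection(require('puppeteer'))` would print nothing by itself; the returned object can be logged to see the endpoint and tab counts.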

## Self-Hosting & Configuration
- Install via `npm install puppeteer` (downloads a compatible Chrome for Testing build) or `npm install puppeteer-core` (bring your own browser)
- Set `PUPPETEER_EXECUTABLE_PATH` to use a custom browser binary
- Pass browser flags such as `--no-sandbox` through the `args` launch option for containerized environments
- Use `puppeteer.connect()` with a `browserWSEndpoint` to attach to a remote browser instance via WebSocket
- Docker images such as `ghcr.io/puppeteer/puppeteer` include all system dependencies
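
The configuration points above can be gathered into one place. The sketch below derives launch options from environment variables. `PUPPETEER_EXECUTABLE_PATH` is read by Puppeteer automatically, so passing it explicitly only makes the choice visible in code. `IN_CONTAINER` is a hypothetical variable invented for this example, not something Puppeteer defines.

```javascript
// Hypothetical helper: build the object passed to puppeteer.launch().
function buildLaunchOptions(env = process.env) {
  const options = { args: [] };

  // Use a custom browser binary when one is configured.
  if (env.PUPPETEER_EXECUTABLE_PATH) {
    options.executablePath = env.PUPPETEER_EXECUTABLE_PATH;
  }

  // IN_CONTAINER is an assumed, application-defined flag. Disabling the
  // sandbox weakens isolation, so only do it where the container is the boundary.
  if (env.IN_CONTAINER === '1') {
    options.args.push('--no-sandbox', '--disable-setuid-sandbox');
  }

  return options;
}
```

A launch call would then look like `puppeteer.launch(buildLaunchOptions())`.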

## Key Features
- Full CDP support, plus WebDriver BiDi (the default protocol for Firefox)
- Built-in request interception for mocking API responses
- Auto-waiting through the Locator API, plus CSS, text, ARIA, and XPath selectors
- Network throttling and device emulation for mobile testing
- First-class TypeScript definitions
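
Request interception is probably the feature that benefits most from an example. The sketch below answers requests whose URL ends with a given suffix using a canned JSON body and lets everything else through. The helper name `mockApi` and its parameters are hypothetical; the `Page` and `HTTPRequest` methods it calls are part of Puppeteer's API.

```javascript
// Hypothetical helper: serve a canned JSON response for matching requests.
async function mockApi(page, urlSuffix, payload) {
  // Interception must be enabled before requests can be answered or modified.
  await page.setRequestInterception(true);

  page.on('request', (request) => {
    // Skip requests that another handler has already resolved.
    if (request.isInterceptResolutionHandled()) return;

    if (request.url().endsWith(urlSuffix)) {
      request.respond({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify(payload),
      });
    } else {
      // Every intercepted request must be continued, answered, or aborted.
      request.continue();
    }
  });
}
```

For instance, `await mockApi(page, '/api/user', { name: 'Ada' })` before `page.goto()` would make the page see that payload in place of the real endpoint.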

## Comparison with Similar Tools
- **Playwright** — Multi-browser from day one with built-in test runner; Puppeteer is Chrome/Firefox focused
- **Selenium** — Language-agnostic via WebDriver; Puppeteer is JavaScript/TypeScript only but talks to the browser's own protocol directly
- **Cypress** — Opinionated test framework with time-travel debugging; Puppeteer is a lower-level library
- **Crawlee** — Built on Puppeteer/Playwright with queue management for large-scale scraping

## FAQ
**Q: Does Puppeteer work with Firefox?**
A: Yes. Puppeteer drives Firefox over WebDriver BiDi. Support arrived as an experimental feature in Puppeteer v21 and has been officially supported since v23.

**Q: Can I run Puppeteer in Docker?**
A: Yes. Use the official container image or install Chromium system dependencies manually. Pass `--no-sandbox` when running as root.

**Q: How does Puppeteer differ from puppeteer-core?**
A: The `puppeteer` package downloads a compatible browser automatically. `puppeteer-core` skips the download and expects you to provide a browser path.

**Q: Is Puppeteer suitable for production scraping?**
A: It works for moderate workloads. For large-scale crawling, consider Crawlee or a dedicated scraping framework that handles retries and queues.
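
For moderate workloads, the retry handling mentioned above can be approximated with a small wrapper. This is a minimal sketch of retrying with exponential backoff, not how Crawlee implements it; the name `withRetries` and its default values are assumptions.

```javascript
// Hypothetical helper: run an async task, retrying with exponential backoff.
async function withRetries(task, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // The attempt number is passed in so the task can log or adapt.
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Delay doubles each time: base, 2x base, 4x base, ...
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // All attempts failed; surface the most recent error.
  throw lastError;
}
```

A flaky navigation could be wrapped as `withRetries(() => page.goto(url))`. Queueing, deduplication, and concurrency limits are still left to the caller, which is where a dedicated framework earns its keep.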

## Sources
- https://github.com/puppeteer/puppeteer
- https://pptr.dev

---
Source: https://tokrepo.com/en/workflows/puppeteer-headless-chrome-automation-library-google-9a935863
Author: Script Depot