What is Crawlee?
Crawlee is a web scraping library for Node.js and Python. It automatically handles proxy rotation, browser fingerprinting, retries, auto-scaling, and data storage, so these concerns are built in rather than added separately.
Core Features
1. Multiple Crawler Types
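Crawlee offers two kinds of crawler: HTTP crawlers, which are fast because they fetch pages without launching a browser, and browser crawlers, which render JavaScript before extracting data.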
2. Anti-Detection
Built-in browser fingerprint randomization and session management.
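To make the session-management idea concrete, here is a minimal sketch in plain Python. It is not Crawlee's implementation or API: the `Session` and `SessionPool` classes, the `max_errors` threshold, and the `session-N` naming are all assumptions made for illustration. The idea shown is that each session carries its own cookies and error count, and a session that has been blocked too often is retired and replaced with a fresh one.

```python
import random
from dataclasses import dataclass


@dataclass
class Session:
    """One browsing identity: an id, its cookies, and an error count."""
    session_id: str
    cookies: dict
    errors: int = 0
    max_errors: int = 3  # assumed threshold, chosen only for illustration

    @property
    def usable(self):
        return self.errors < self.max_errors


class SessionPool:
    """Hypothetical pool that replaces sessions once they hit max_errors."""

    def __init__(self, size):
        self._counter = 0
        self._sessions = [self._new_session() for _ in range(size)]

    def _new_session(self):
        self._counter += 1
        return Session(session_id=f"session-{self._counter}", cookies={})

    def get(self):
        """Return a random usable session, replacing retired ones first."""
        self._sessions = [
            s if s.usable else self._new_session() for s in self._sessions
        ]
        return random.choice(self._sessions)


pool = SessionPool(size=1)
first = pool.get()
first.errors = first.max_errors  # simulate the session being blocked repeatedly
second = pool.get()              # the pool hands out a fresh session instead
print(first.session_id, "->", second.session_id)
```

Tying cookies and error counts to a session lets a crawler drop one identity that a site has flagged without discarding the others.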
3. Proxy Rotation
Proxies are rotated automatically on a per-request basis.
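As a rough illustration of per-request rotation, the sketch below pairs each request with the next proxy in round-robin order. This is a generic technique written in plain Python, not Crawlee's own rotation logic; the proxy URLs and the `assign_proxies` helper are hypothetical.

```python
from itertools import cycle

# Hypothetical proxy URLs, for illustration only.
PROXIES = [
    "http://proxy-1.example.com:8000",
    "http://proxy-2.example.com:8000",
    "http://proxy-3.example.com:8000",
]


def assign_proxies(urls, proxies):
    """Pair each request URL with the next proxy in round-robin order."""
    rotation = cycle(proxies)
    return [(url, next(rotation)) for url in urls]


urls = [f"https://example.com/page/{i}" for i in range(1, 5)]
assignments = assign_proxies(urls, PROXIES)
for url, proxy in assignments:
    print(f"{url} via {proxy}")
```

With three proxies and four requests, the fourth request wraps around to the first proxy again.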
4. Auto-Scaling
Crawlee adjusts concurrency based on available system resources and on how the target site responds.
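One way to picture this kind of auto-scaling is a small feedback rule: raise concurrency gradually while things are healthy, and cut it sharply when the machine or the target site shows strain. The function below is a simplified sketch under assumed thresholds (90% CPU load, 20% error rate) and hypothetical names; it does not reproduce Crawlee's actual scaling algorithm.

```python
def next_concurrency(current, cpu_load, error_rate, minimum=1, maximum=20):
    """Return the concurrency to use for the next interval.

    cpu_load and error_rate are fractions between 0.0 and 1.0.
    The thresholds below are illustrative assumptions, not Crawlee's values.
    """
    overloaded = cpu_load > 0.9 or error_rate > 0.2
    if overloaded:
        # Back off quickly when the machine or the target site struggles.
        proposed = current // 2
    else:
        # Otherwise probe upward one step at a time.
        proposed = current + 1
    # Keep the result inside the configured bounds.
    return max(minimum, min(maximum, proposed))


print(next_concurrency(10, cpu_load=0.50, error_rate=0.00))
print(next_concurrency(10, cpu_load=0.95, error_rate=0.00))
print(next_concurrency(10, cpu_load=0.50, error_rate=0.30))
```

Increasing slowly and decreasing fast is a common choice for this sort of controller, because overshooting costs more (blocked requests, an exhausted machine) than undershooting.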
5. Built-In Storage
Structured datasets, key-value stores, and request queues.
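The three storage types can be pictured with ordinary Python containers. The sketch below is a conceptual stand-in, not Crawlee's storage API: the `RequestQueue` class and its method names are hypothetical, the dataset is a plain list of dictionaries, and the key-value store is a plain dictionary.

```python
from collections import deque


class RequestQueue:
    """FIFO queue of URLs that ignores URLs it has already seen."""

    def __init__(self):
        self._pending = deque()
        self._seen = set()

    def add(self, url):
        """Enqueue a URL; return False if it was already added before."""
        if url in self._seen:
            return False
        self._seen.add(url)
        self._pending.append(url)
        return True

    def fetch_next(self):
        """Return the next pending URL, or None when the queue is empty."""
        return self._pending.popleft() if self._pending else None


queue = RequestQueue()
dataset = []          # structured records, one dict per processed page
key_value_store = {}  # arbitrary values stored under string keys

queue.add("https://example.com/")
queue.add("https://example.com/about")
queue.add("https://example.com/")  # duplicate, ignored

while (url := queue.fetch_next()) is not None:
    # A real crawler would download and parse the page here.
    dataset.append({"url": url, "status": "visited"})

key_value_store["pages_visited"] = len(dataset)
print(dataset)
print(key_value_store)
```

The request queue decides what to crawl next and avoids revisiting pages, the dataset collects uniform records, and the key-value store holds everything else, such as run statistics or saved state.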
FAQ
Q: How does Crawlee compare to Scrapy?
A: Crawlee has native browser support and built-in anti-detection, and it is available for both JavaScript and Python. Scrapy is Python-only and primarily HTTP-based.