Pydoll Features & Architecture
Why Not Selenium/Playwright?
| Feature | Selenium | Playwright | Pydoll |
|---|---|---|---|
| WebDriver needed | Yes | No (own protocol, bundled browsers) | No (CDP direct) |
| Bot detection | Easily detected | Detectable | Hard to detect |
| CAPTCHA solving | External service | External service | Built-in |
| Async support | Limited | Yes | Full asyncio |
| Setup complexity | Driver versioning | Auto-install | Zero — uses system Chrome |
Chrome DevTools Protocol (CDP)
Pydoll connects to Chrome's built-in debugging interface:
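
    import asyncio
    from pydoll.browser import Chrome

    async def main():
        # No driver download, no version matching:
        # Pydoll just uses your installed Chrome
        async with Chrome() as browser:
            page = await browser.start()
            # You're now controlling a real Chrome instance

    asyncio.run(main())

Benefits: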
- No WebDriver fingerprint for bot detectors to find
- Access to network interception, console logs, performance metrics
- Full control over cookies, storage, headers
- Works with any installed Chromium-based browser
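
To make "CDP direct" more concrete: CDP is a JSON message protocol that Chrome exposes over a WebSocket when remote debugging is enabled. The sketch below is not Pydoll's code and talks to no browser; it only models the message shapes (commands carry a client-chosen `id`, replies echo that `id`, events carry a `method` and no `id`). The `CDPSession` class and its methods are hypothetical names introduced for illustration.

```python
import itertools
import json


class CDPSession:
    """Hypothetical, transport-free model of CDP message bookkeeping."""

    def __init__(self):
        self._ids = itertools.count(1)  # command ids are chosen by the client
        self._pending = {}              # id -> method name awaiting a reply
        self._handlers = {}             # event name -> list of callbacks

    def command(self, method, **params):
        """Serialize a command such as Page.navigate into its JSON wire form."""
        msg_id = next(self._ids)
        self._pending[msg_id] = method
        return json.dumps({"id": msg_id, "method": method, "params": params})

    def on(self, event_name, callback):
        """Register a callback for an event such as Network.responseReceived."""
        self._handlers.setdefault(event_name, []).append(callback)

    def receive(self, raw):
        """Route one incoming message: a reply (has id) or an event (no id)."""
        msg = json.loads(raw)
        if "id" in msg:
            method = self._pending.pop(msg["id"])
            return ("reply", method, msg.get("result", {}))
        for callback in self._handlers.get(msg["method"], []):
            callback(msg.get("params", {}))
        return ("event", msg["method"], msg.get("params", {}))


session = CDPSession()
wire = session.command("Page.navigate", url="https://example.com")

seen_urls = []
session.on("Network.responseReceived", lambda p: seen_urls.append(p["response"]["url"]))

# Simulated messages, as Chrome might send them back over the socket
reply = session.receive('{"id": 1, "result": {"frameId": "F1"}}')
event = session.receive(
    '{"method": "Network.responseReceived",'
    ' "params": {"requestId": "R1", "response": {"url": "https://example.com/"}}}'
)
```

A real client would send `wire` over the WebSocket and feed each received frame to `receive`; matching replies by `id` is what lets many commands be in flight at once on a single connection.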
Built-in CAPTCHA Solving
Handle reCAPTCHA and Turnstile without third-party services:

    import asyncio
    from pydoll.browser import Chrome

    async def main():
        async with Chrome() as browser:
            page = await browser.start()
            await page.go_to("https://site-with-captcha.com")
            # Pydoll automatically detects and solves CAPTCHAs:
            # reCAPTCHA v3: solved via token injection
            # Cloudflare Turnstile: solved via interaction simulation

    asyncio.run(main())

Concurrent Sessions
Run multiple browser instances in parallel:

    import asyncio
    from pydoll.browser import Chrome

    async def scrape_url(url):
        # Each task gets its own browser instance
        async with Chrome() as browser:
            page = await browser.start()
            await page.go_to(url)
            return await page.get_content()

    async def main():
        urls = ["https://site1.com", "https://site2.com", "https://site3.com"]
        # gather() runs all scrapes concurrently and preserves input order
        results = await asyncio.gather(*[scrape_url(u) for u in urls])
        for r in results:
            print(len(r), "chars")

    asyncio.run(main())

Network Interception
Monitor and modify network requests:

    import asyncio
    from pydoll.browser import Chrome

    async def main():
        async with Chrome() as browser:
            page = await browser.start()

            # Listen to network events: print the start of matching API responses
            async def on_response(event):
                url = event["params"]["response"]["url"]
                if "api/data" in url:
                    body = await page.get_response_body(event["params"]["requestId"])
                    print(f"API response: {body[:200]}")

            # Enable network events and subscribe before navigating
            await page.enable_network()
            page.on("Network.responseReceived", on_response)
            await page.go_to("https://example.com")

    asyncio.run(main())

Anti-Detection Features
- No navigator.webdriver flag
- No ChromeDriver process in task manager
- Realistic mouse movements and typing delays
- Proper viewport and screen resolution emulation
- Cookie and localStorage persistence across sessions
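
As an illustration of the "typing delays" item, here is a minimal sketch of how per-keystroke pauses could be randomized rather than fixed. This is not Pydoll's implementation: the function name `human_typing_delays`, the base delay, the jitter range, and the extra pause after spaces and punctuation are all assumptions chosen for the example.

```python
import random


def human_typing_delays(text, base=0.12, jitter=0.08, pause_after=" .,", seed=None):
    """Return one delay (in seconds) per character of `text`.

    Hypothetical helper: each delay is `base` plus uniform jitter, with a
    longer pause after spaces and punctuation to mimic word boundaries.
    """
    rng = random.Random(seed)  # seedable so runs can be reproduced in tests
    delays = []
    for ch in text:
        delay = base + rng.uniform(-jitter, jitter)
        if ch in pause_after:
            delay += rng.uniform(0.05, 0.25)  # extra hesitation at word boundaries
        delays.append(max(delay, 0.01))       # never return a zero or negative delay
    return delays


delays = human_typing_delays("hello, world", seed=42)
```

An automation script would then wait for each delay between key events (for example with `asyncio.sleep`), so the timing between keystrokes is irregular instead of perfectly uniform.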
FAQ
Q: What is Pydoll? A: Pydoll is a Python library (6,700+ GitHub stars) for browser automation over the Chrome DevTools Protocol (CDP), with no WebDriver involved. It offers built-in CAPTCHA solving, anti-detection features, and full async support.
Q: How is Pydoll different from Playwright or Selenium? A: Pydoll connects directly to Chrome via CDP and skips WebDriver entirely, which removes the WebDriver fingerprint that bot detectors look for. It also includes built-in CAPTCHA solving (reCAPTCHA v3, Cloudflare Turnstile), for which Playwright and Selenium rely on external services.
Q: Is Pydoll free? A: Yes, it is fully open source under the MIT license.