Scripts · March 31, 2026 · 1 min read

ScrapeGraphAI — AI-Powered Web Scraping

Python scraping library powered by LLMs. Describe what you want to extract in natural language, get structured data back. Handles dynamic pages. 23K+ stars.

Introduction

ScrapeGraphAI is a Python web scraping library that uses LLMs to extract structured data from websites. Instead of writing CSS selectors or XPath, describe what you want in natural language. It handles dynamic JavaScript pages (via Playwright), follows pagination, and returns clean structured data. Works with OpenAI, Anthropic, Google, and local models via Ollama. 23,000+ GitHub stars, MIT licensed.

Best for: Developers who need structured data extraction from websites without writing scrapers
Works with: OpenAI, Anthropic, Google, Ollama, Groq, any LLM


Key Features

Natural Language Extraction

Describe what you want — the LLM figures out how to extract it:

prompt = "Get all product names, prices, and ratings from this page"
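A minimal sketch of how such a prompt might be passed to the library, assuming the `SmartScraperGraph(prompt=..., source=..., config=...)` constructor and `run()` method from the project's README. The helper names `build_config` and `scrape`, the model string, and the example URL are illustrative assumptions, not part of the original listing.

```python
from typing import Optional


def build_config(model: str, api_key: Optional[str] = None, headless: bool = True) -> dict:
    """Build a config dict in the shape the ScrapeGraphAI README uses (assumed)."""
    llm = {"model": model}
    if api_key:
        # Only cloud providers need a key; local models can omit it.
        llm["api_key"] = api_key
    return {"llm": llm, "verbose": False, "headless": headless}


def scrape(prompt: str, url: str, config: dict):
    """Run a single-page extraction and return whatever the graph produces."""
    # Imported lazily so this module loads even without scrapegraphai installed.
    from scrapegraphai.graphs import SmartScraperGraph

    graph = SmartScraperGraph(prompt=prompt, source=url, config=config)
    return graph.run()


# Example call (hypothetical URL; needs scrapegraphai, Playwright, and an API key):
# config = build_config("openai/gpt-4o-mini", api_key="YOUR_KEY")
# result = scrape(
#     "Get all product names, prices, and ratings from this page",
#     "https://example.com/products",
#     config,
# )
```

Keeping the config in a small helper makes it easy to swap providers without touching the scraping call.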

Multiple Graph Types

Graph                    Use Case
SmartScraperGraph        Single page extraction
SearchGraph              Search + extract from results
SpeechGraph              Extract + convert to audio
ScriptCreatorGraph       Generate reusable scraper code
SmartScraperMultiGraph   Multi-page extraction

Dynamic Pages

Built-in Playwright support for JavaScript-rendered content. Handles SPAs, infinite scroll, and AJAX.

Structured Output

Returns clean JSON/dict matching your prompt. No post-processing needed.

Local Models

Run the LLM locally with Ollama, so no page content is sent to cloud APIs. Fetching the target site still requires network access.
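A local-model setup might look like the sketch below. It assumes the `ollama/<model>` naming and the `base_url` and `format` keys shown in the project's README, plus Ollama's default port 11434. The helpers `ollama_config` and `is_local_endpoint` are hypothetical names introduced here for illustration.

```python
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def ollama_config(model: str = "llama3.2", base_url: str = "http://localhost:11434") -> dict:
    """Config dict pointing the LLM step at a local Ollama server (assumed keys)."""
    return {
        "llm": {
            "model": f"ollama/{model}",
            "base_url": base_url,
            "format": "json",  # ask Ollama for JSON-formatted responses
        },
        "verbose": False,
        "headless": True,
    }


def is_local_endpoint(url: str) -> bool:
    """Return True when the URL's host is the local machine."""
    return urlparse(url).hostname in LOCAL_HOSTS


config = ollama_config()
# Guard against accidentally pointing the config at a remote service.
if not is_local_endpoint(config["llm"]["base_url"]):
    raise ValueError("LLM endpoint is not local")
```

The endpoint check is optional, but it gives a cheap safeguard when the point of using Ollama is keeping scraped content on the machine.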


FAQ

Q: What is ScrapeGraphAI?
A: An AI-powered Python scraping library. Describe what you want to extract in natural language, get structured data back. Handles dynamic JS pages. 23K+ stars.

Q: Is it legal to scrape websites with ScrapeGraphAI?
A: ScrapeGraphAI is a tool; legality depends on the target site's terms of service and your jurisdiction. Always respect robots.txt and rate limits.
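The robots.txt and rate-limit advice can be sketched with the Python standard library alone, independent of ScrapeGraphAI. The names `make_robots_checker` and `RateLimiter`, the user agent `my-bot`, and the sample rules are illustrative assumptions. In real use the rules would be fetched with `RobotFileParser.set_url(...)` followed by `read()`.

```python
import time
from urllib.robotparser import RobotFileParser


def make_robots_checker(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class RateLimiter:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last = None  # monotonic timestamp of the previous request

    def wait(self) -> float:
        """Sleep if needed; return the number of seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = time.monotonic() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay:
                time.sleep(delay)
        self._last = time.monotonic()
        return delay


sample_rules = "User-agent: *\nDisallow: /private/\n"
robots = make_robots_checker(sample_rules)
limiter = RateLimiter(min_interval=0.05)

for url in ("https://example.com/products", "https://example.com/private/data"):
    if robots.can_fetch("my-bot", url):
        limiter.wait()
        # The scraping call for `url` would go here.
```

Checking permission before each request and spacing requests apart are separate concerns, so the sketch keeps them in separate helpers.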



Source and Acknowledgments

Created by ScrapeGraphAI. Licensed under MIT. ScrapeGraphAI/Scrapegraph-ai — 23,000+ GitHub stars
