Crawl4AI MCP — Web Crawling Server for AI Agents
MCP server that gives AI agents web crawling superpowers. Crawl4AI MCP enables Claude Code and Cursor to scrape, extract, and process web content through tool calls.
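This asset is staged safely
This asset is staged safely first. The copied instruction asks the agent to read the staged files and confirm before activating scripts, MCP configuration, or global configuration.
npx -y tokrepo@latest install b3396cdc-13f2-4aa8-930a-0894db1046d7 --target codex
Files are staged first; read the staged README and install plan before activating.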
What it is
Crawl4AI MCP is a Model Context Protocol server that provides web crawling capabilities to AI agents. It enables Claude Code, Cursor, and other MCP-compatible clients to scrape web pages, extract structured content, and process web data through standardized tool calls. The server handles browser automation, content extraction, and markdown conversion behind the scenes.
Developers building AI agents that need web access, researchers collecting data from websites, and teams automating content extraction workflows use Crawl4AI MCP as the bridge between their AI tools and the web.
How it saves time or tokens
Without an MCP server for crawling, you would need to write custom scraping scripts, handle browser automation, and parse HTML manually. Crawl4AI MCP packages this into tool calls that AI agents invoke directly. The server extracts clean markdown from web pages, reducing the token overhead of processing raw HTML. It also handles JavaScript rendering, which simple HTTP-based scrapers cannot do.
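To make the token-overhead point concrete, here is a rough sketch that compares the size of a small HTML page with the visible text left after dropping tags, scripts, styles, and navigation. It uses only Python's standard library and is a crude stand-in, not Crawl4AI's actual extraction logic; the sample page and the `visible_text` helper are invented for illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping script/style/nav blocks (a crude stand-in)."""

    SKIP = {"script", "style", "nav"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every skipped block.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    """Return the page's visible text, one text fragment per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Invented sample page: a little content wrapped in navigation and scripts.
SAMPLE_HTML = """<html><head><style>body{margin:0}</style></head>
<body><nav><a href="/">Home</a><a href="/docs">Docs</a></nav>
<div class="content"><h1>Pricing</h1><p>Pro plan: $10 per month.</p></div>
<script>trackPageView();</script></body></html>"""

text = visible_text(SAMPLE_HTML)
print(len(SAMPLE_HTML), "characters of HTML ->", len(text), "characters of text")
```

Real pages carry far more markup than this sample, so the gap between raw HTML and extracted content is usually much larger.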
How to use
- Add Crawl4AI MCP to your Claude Code configuration:
{
  "mcpServers": {
    "crawl4ai": {
      "command": "uvx",
      "args": ["crawl4ai-mcp"]
    }
  }
}
- Or install and run directly:
pip install crawl4ai-mcp
crawl4ai-mcp
- In Claude Code, ask the agent to crawl or extract content from any URL. The MCP server handles the rest.
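If you would rather script the configuration step than edit JSON by hand, the following sketch adds the `crawl4ai` entry to an MCP settings file without overwriting servers that are already configured. The `add_mcp_server` helper and the file names are hypothetical; the location of the real settings file depends on your client, so check its documentation before pointing the script at it.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Add or update one entry under "mcpServers", keeping any other entries."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file; replace the path with your client's settings file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp-settings.json"  # hypothetical file name
    updated = add_mcp_server(demo_path, "crawl4ai", "uvx", ["crawl4ai-mcp"])
    print(json.dumps(updated, indent=2))
```

Merging into the existing file rather than rewriting it matters when other MCP servers are already registered in the same settings file.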
Example
Configuration in .claude/mcp.json:
{
  "mcpServers": {
    "crawl4ai": {
      "command": "uvx",
      "args": ["crawl4ai-mcp"],
      "env": {
        "CRAWL4AI_BROWSER": "chromium"
      }
    }
  }
}
In a Claude Code session:
> Crawl https://example.com/docs and extract the main content as markdown
> Scrape the pricing table from https://example.com/pricing
> Extract all links from the documentation page
Related on TokRepo
- Web Scraping Tools -- explore web scraping tools and frameworks for data extraction
- MCP Chrome Integration -- discover browser automation through MCP
Common pitfalls
- Some websites block headless browsers. Crawl4AI uses browser automation under the hood, but heavily protected sites may still return empty results or CAPTCHAs.
- The server requires a Chromium binary for JavaScript rendering. Ensure your environment supports headless browser execution (not available in all CI containers).
- Crawling generates significant output. Set appropriate token limits in your MCP client configuration to avoid overwhelming the agent context with large page content.
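Where the client offers no output limit, or you post-process crawled pages yourself, one option is to trim the markdown before it reaches the agent. The sketch below cuts at paragraph boundaries to fit a rough budget. It assumes about four characters per token, which is only a rule of thumb, and `truncate_markdown` is a hypothetical helper, not part of Crawl4AI MCP.

```python
MARKER = "\n\n[truncated]"


def truncate_markdown(markdown: str, max_tokens: int, chars_per_token: int = 4) -> str:
    """Cut markdown at a paragraph boundary so it fits a rough token budget."""
    budget = max_tokens * chars_per_token  # assumption: ~4 characters per token
    if len(markdown) <= budget:
        return markdown
    room = max(budget - len(MARKER), 0)  # leave space for the truncation marker
    kept = []
    used = 0
    for paragraph in markdown.split("\n\n"):
        cost = len(paragraph) + 2  # +2 for the blank-line separator
        if used + cost > room:
            break
        kept.append(paragraph)
        used += cost
    if not kept:
        # The first paragraph alone is too long: fall back to a hard cut.
        return markdown[:room] + MARKER
    return "\n\n".join(kept) + MARKER


# Example: twenty paragraphs squeezed into a 50-token (about 200-character) budget.
page = "\n\n".join(f"para {i} " + "x" * 50 for i in range(20))
short = truncate_markdown(page, max_tokens=50)
print(len(page), "->", len(short), "characters")
```

Cutting on paragraph boundaries keeps headings and sentences intact, and the marker tells the agent that the page continues beyond what it received.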
FAQ
Crawl4AI MCP works with any MCP-compatible client, including Claude Code, Cursor, and other tools that support the Model Context Protocol. Configure it in your client's MCP settings file and the crawling tools become available to the AI agent.
JavaScript-rendered pages are supported. The server uses browser automation (Chromium) to render pages, so it captures content generated by JavaScript frameworks such as React, Vue, and Angular. Simple HTTP-based scrapers see only the initial HTML.
Crawl4AI MCP extracts the main content from web pages and converts it to clean markdown. It removes navigation, ads, and boilerplate HTML, returning only the relevant content. This reduces token usage when feeding web content to LLMs.
Multiple pages can be crawled in one session. You can ask the AI agent to crawl several URLs, and it will invoke the MCP tools sequentially. The server handles each request independently. For large-scale crawling, consider using Crawl4AI as a Python library directly rather than through MCP.
The server processes requests sequentially by default, which slows crawling somewhat but is not a rate limit: the server does not enforce one itself. For responsible crawling, add delays between requests in your agent instructions, and respect each target site's terms of service.
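When you drive crawling from your own script instead of agent instructions, the same idea of spacing out requests can be sketched as follows. The `fetch` argument is a placeholder for whatever actually retrieves a page; `fake_fetch` here returns canned text, so nothing in this example touches the network or calls Crawl4AI. All names are hypothetical.

```python
import time
from typing import Callable, Iterable


def crawl_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, str]:
    """Fetch URLs one at a time, pausing between consecutive requests."""
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause before every request except the first
        results[url] = fetch(url)
    return results


def fake_fetch(url: str) -> str:
    # Placeholder for a real crawl call; returns canned markdown.
    return f"# Content of {url}"


pages = crawl_politely(
    ["https://example.com/a", "https://example.com/b"],
    fake_fetch,
    delay_seconds=0.1,
)
print(list(pages))
```

Passing `sleep` as a parameter keeps the function easy to test, and a fixed delay is the simplest policy; a site's robots.txt or terms of service may call for something stricter.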
Source and acknowledgments
Built on crawl4ai — 20k+ stars, Apache 2.0
Related assets
Apify MCP Server — 8,000+ Web Scrapers for Agents
Apify MCP Server connects agents to Apify Actors via a hosted endpoint (mcp.apify.com) or local run, turning thousands of web scrapers into callable tools.
WhatsApp MCP Server — Chat with WhatsApp via AI Agents
MCP server connecting Claude and AI agents to your personal WhatsApp. Search contacts, read messages, send replies and media via natural language.
Notte — Browser Automation MCP for AI Agents
MCP server that turns web browsers into AI agent tools. Notte provides structured browser actions like click, type, navigate, and extract for LLM-driven automation.
n8n MCP Server — Build Automations with AI, 1,396 Nodes
MCP server giving AI agents access to 1,396 n8n nodes and 2,709 workflow templates. Build and manage n8n automations through natural language.