# Crawl4AI MCP — Web Crawling Server for AI Agents

> MCP server that gives AI agents web crawling superpowers. Crawl4AI MCP enables Claude Code and Cursor to scrape, extract, and process web content through tool calls.

## Install

Merge the JSON below into your `.mcp.json`:

## Quick Use

```json
// claude_desktop_config.json or .cursor/mcp.json
{
  "mcpServers": {
    "crawl4ai": {
      "command": "uvx",
      "args": ["crawl4ai-mcp"]
    }
  }
}
```

```bash
# Or install and run directly
pip install crawl4ai-mcp
crawl4ai-mcp
```

Now your AI agent can crawl any website:

```
User: "Crawl the Anthropic docs and summarize the API pricing"
Agent: [calls crawl4ai.crawl_url tool] → returns clean markdown → summarizes
```

## What is Crawl4AI MCP?

Crawl4AI MCP is a Model Context Protocol server that wraps the Crawl4AI web scraping library. It gives AI agents (Claude Code, Cursor, Cline) the ability to crawl websites, extract structured data, and process web content — all through standard MCP tool calls. The agent decides when to crawl and what to extract, making it ideal for research, data gathering, and RAG workflows.

**Answer-Ready**: Crawl4AI MCP is an MCP server for AI agent web crawling. Gives Claude Code and Cursor the ability to scrape websites, extract markdown, and gather structured data via tool calls. Handles JavaScript rendering and anti-bot measures. Based on Crawl4AI (20k+ stars).

**Best for**: AI agents that need web access for research or data extraction. **Works with**: Claude Code, Claude Desktop, Cursor, any MCP-compatible client. **Setup time**: Under 2 minutes.

## Core Tools

### 1. crawl_url
Crawl a single page and return clean markdown.

### 2. smart_crawl
Crawl with AI-powered content extraction — automatically identifies main content and removes noise.

### 3. extract_structured
Extract structured data using CSS selectors or AI extraction.

### 4. batch_crawl
Crawl multiple URLs in parallel.

## Configuration Options

```json
{
  "mcpServers": {
    "crawl4ai": {
      "command": "uvx",
      "args": ["crawl4ai-mcp"],
      "env": {
        "CRAWL4AI_BROWSER": "chromium",
        "CRAWL4AI_HEADLESS": "true",
        "CRAWL4AI_MAX_CONCURRENT": "5"
      }
    }
  }
}
```

## Use Cases

| Use Case | How |
|----------|-----|
| Research Assistant | Agent crawls sources, synthesizes findings |
| Competitive Analysis | Crawl competitor sites, extract pricing |
| Documentation Q&A | Crawl docs site, answer questions |
| Content Aggregation | Batch crawl RSS feeds, summarize |
| Lead Generation | Extract contact info from business pages |

## Crawl4AI MCP vs Other Web MCPs

| Feature | Crawl4AI MCP | Puppeteer MCP | Firecrawl MCP |
|---------|-------------|---------------|---------------|
| JS Rendering | Yes | Yes | Yes |
| AI Extraction | Built-in | No | API-based |
| Batch Crawl | Yes | No | Yes |
| Cost | Free (local) | Free (local) | API pricing |
| Anti-bot | Good | Basic | Excellent |
| Speed | Fast | Moderate | Fast |

## FAQ

**Q: Does it handle JavaScript-heavy sites?**
A: Yes, uses Playwright/Chromium for full JavaScript rendering before extraction.

**Q: How is it different from the Puppeteer MCP?**
A: Crawl4AI MCP is optimized for content extraction (clean markdown, structured data). Puppeteer MCP is lower-level (screenshots, DOM manipulation, form filling).

**Q: Can I use it for full site crawling?**
A: Yes, the batch_crawl tool supports crawling multiple pages. For full site discovery, combine with sitemap parsing.

## Source & Thanks

> Built on [Crawl4AI](https://github.com/unclecode/crawl4ai) by unclecode. Licensed under Apache 2.0.
>
> [crawl4ai-mcp](https://pypi.org/project/crawl4ai-mcp/) — MCP wrapper for Crawl4AI (20k+ stars)

<!-- ZH -->

## 快速使用

```json
{"mcpServers": {"crawl4ai": {"command": "uvx", "args": ["crawl4ai-mcp"]}}}
```

一行配置让 AI Agent 获得网页爬取能力。

## 什么是 Crawl4AI MCP？

Crawl4AI MCP 是基于 MCP 协议的网页爬取服务器，让 Claude Code、Cursor 等 AI Agent 通过工具调用抓取和提取网页内容。

**一句话总结**：MCP 网页爬取服务器，AI Agent 通过工具调用抓取网页、提取 Markdown 和结构化数据，支持 JS 渲染，基于 Crawl4AI（20k+ stars）。

**适合人群**：需要网页访问能力的 AI Agent 用户。

## 核心工具

### 1. crawl_url — 单页抓取
### 2. smart_crawl — AI 智能提取
### 3. batch_crawl — 批量爬取

## 常见问题

**Q: 支持 JS 渲染？**
A: 支持，使用 Playwright/Chromium。

**Q: 和 Puppeteer MCP 区别？**
A: Crawl4AI MCP 专注内容提取，Puppeteer MCP 偏底层 DOM 操作。

## 来源与致谢

> 基于 [crawl4ai](https://github.com/unclecode/crawl4ai) — 20k+ stars, Apache 2.0

---
Source: https://tokrepo.com/en/workflows/b3396cdc-13f2-4aa8-930a-0894db1046d7
Author: MCP Hub