# Firecrawl: Web Scraping API for AI Applications

> Turn any website into clean markdown or structured data for LLMs. Firecrawl handles JavaScript rendering, anti-bot bypassing, sitemaps, and batch crawling via a simple API.

## Install

Save the content below to `.claude/skills/` or append to your `CLAUDE.md`:

## Quick Use

```bash
pip install firecrawl-py
```

```python
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-...")

# Scrape a single page
result = app.scrape_url("https://docs.anthropic.com", params={"formats": ["markdown"]})
print(result["markdown"])

# Crawl entire site
crawl = app.crawl_url("https://docs.anthropic.com", params={"limit": 100})
for page in crawl["data"]:
    print(page["markdown"][:200])
```

## What is Firecrawl?

Firecrawl is a web scraping API designed for AI applications. It converts any website into clean markdown or structured data that LLMs can consume. It handles JavaScript rendering, anti-bot detection, rate limiting, and sitemap discovery, so you can focus on building your AI pipeline.

**Answer-Ready**: Firecrawl is a web scraping API that converts websites into clean markdown or structured data for LLMs. It handles JavaScript rendering, anti-bot bypassing, and batch crawling. Used by major AI companies for RAG and training data. 30k+ GitHub stars.

**Best for**: AI teams building RAG pipelines or data extraction workflows.

**Works with**: Any LLM framework, LangChain, LlamaIndex, Claude Code.

**Setup time**: Under 2 minutes.

## Core Features

### 1. Single Page Scrape

Get clean markdown from a page with a single call.

```python
result = app.scrape_url("https://example.com", params={
    "formats": ["markdown", "html", "links"],
    "onlyMainContent": True,  # Strip nav, footer, ads
})
```

### 2. Full Site Crawl

Links are discovered automatically; depth and path filters limit the crawl.

```python
crawl = app.crawl_url("https://docs.example.com", params={
    "limit": 500,                 # Max pages
    "maxDepth": 3,                # Link depth
    "includePaths": ["/docs/*"],
    "excludePaths": ["/blog/*"],
})
```

### 3. Structured Extraction

Define the output format with a JSON Schema.

```python
result = app.scrape_url("https://example.com/pricing", params={
    "formats": ["extract"],
    "extract": {
        "schema": {
            "type": "object",
            "properties": {
                "plans": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "price": {"type": "string"},
                            "features": {"type": "array", "items": {"type": "string"}}
                        }
                    }
                }
            }
        }
    }
})
```

### 4. Map (Discover URLs)

List the URLs found on a site.

```python
links = app.map_url("https://example.com")
print(f"Found {len(links)} pages")
```

### 5. Self-Hosting

Deploy with Docker Compose, with no page limits.

```bash
git clone https://github.com/mendableai/firecrawl
cd firecrawl  # Run Compose from the repository root
docker compose up -d
# API at http://localhost:3002
```

## Use Cases

| Use Case | How |
|----------|-----|
| RAG Pipeline | Crawl docs → markdown → embed → vector DB |
| Competitive Intel | Scrape competitor pricing pages |
| Training Data | Extract clean text from web sources |
| Monitoring | Track website changes over time |

## Pricing

| Tier | Pages/mo | Price |
|------|----------|-------|
| Free | 500 | $0 |
| Hobby | 3,000 | $16/mo |
| Standard | 100,000 | $83/mo |
| Self-hosted | Unlimited | Free |

## FAQ

**Q: How does it handle JavaScript-heavy sites?**
A: Firecrawl uses headless browsers to render JavaScript before extraction.

**Q: Can I self-host?**
A: Yes, it is fully open-source, and a Docker Compose deployment is available.

**Q: How does it compare to Jina Reader?**
A: Firecrawl offers full site crawling, structured extraction, and sitemap discovery. Jina Reader is simpler (a URL prefix for single pages).

## Source & Thanks

> Created by [Mendable](https://github.com/mendableai). Licensed under AGPL-3.0.
>
> [mendableai/firecrawl](https://github.com/mendableai/firecrawl) (30k+ stars)

---

Source: https://tokrepo.com/en/workflows/firecrawl-web-scraping-api-ai-applications-6a62a986
Author: Firecrawl
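
For the RAG pipeline use case (crawl docs → markdown → embed → vector DB), the step between markdown and embedding is usually chunking. The following is a minimal sketch of that step, not part of the Firecrawl SDK. It assumes a crawl result shaped like the `crawl["data"]` list used in the crawl examples, where each page has a `"markdown"` key. The helper names `split_by_heading` and `crawl_to_records` and the `sample_crawl` data are hypothetical, chosen for illustration, and the code runs offline without an API key.

```python
def split_by_heading(markdown: str) -> list:
    """Split a markdown document into sections, starting a new one at each heading."""
    sections = []
    current = []
    in_fence = False
    for line in markdown.splitlines():
        # Track fenced code blocks so "# comment" lines inside them
        # are not mistaken for headings.
        if line.strip().startswith("```"):
            in_fence = not in_fence
        is_heading = (
            not in_fence
            and line.startswith("#")
            and line.lstrip("#").startswith(" ")
        )
        if is_heading and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    # Drop sections that were only whitespace.
    return [s for s in sections if s]


def crawl_to_records(crawl: dict) -> list:
    """Flatten a crawl result into one record per markdown section."""
    records = []
    for page_index, page in enumerate(crawl["data"]):
        chunks = split_by_heading(page.get("markdown") or "")
        for chunk_index, text in enumerate(chunks):
            records.append({"page": page_index, "chunk": chunk_index, "text": text})
    return records


# Hypothetical stand-in for a real crawl result (assumed shape, invented content).
sample_crawl = {
    "data": [
        {"markdown": "# Intro\nWelcome.\n\n## Setup\n```bash\n# install\npip install firecrawl-py\n```"},
        {"markdown": "# API\nDetails."},
    ]
}

records = crawl_to_records(sample_crawl)
for record in records:
    first_line = record["text"].splitlines()[0]
    print(record["page"], record["chunk"], first_line)
```

Each record can then be passed to whichever embedding model and vector store the pipeline uses. Splitting on headings tends to keep each chunk on one topic, and the fence check keeps code samples together with the section that explains them.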