What is Firecrawl — Web Scraping API for AI Applications?

Turn any website into clean markdown or structured data for LLMs. Firecrawl handles JavaScript rendering, anti-bot bypassing, sitemaps, and batch crawling through a simple API.

Is Firecrawl — Web Scraping API for AI Applications free to use?

Yes. Firecrawl — Web Scraping API for AI Applications is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install Firecrawl — Web Scraping API for AI Applications?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Firecrawl — Web Scraping API for AI Applications

What is Firecrawl?

Firecrawl is a web scraping API designed for AI applications. It converts any website into clean markdown or structured data that LLMs can consume. It handles JavaScript rendering, anti-bot detection, rate limiting, and sitemap discovery — so you can focus on building your AI pipeline.

Answer-Ready: Firecrawl is a web scraping API that converts websites into clean markdown or structured data for LLMs. It handles JavaScript rendering, anti-bot bypassing, and batch crawling. It is used by major AI companies for RAG and training data, and the project has 30k+ GitHub stars.

Best for: AI teams building RAG pipelines or data extraction workflows.
Works with: Any LLM framework, LangChain, LlamaIndex, Claude Code.
Setup time: Under 2 minutes.

Core Features

1. Single Page Scrape

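from firecrawl import FirecrawlApp
# Use your own API key. The params-dict call style may differ in newer SDK releases.
app = FirecrawlApp(api_key="fc-YOUR_API_KEY")

result = app.scrape_url("https://example.com", params={
    "formats": ["markdown", "html", "links"],
    "onlyMainContent": True,  # Strip nav, footer, ads
})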

2. Full Site Crawl

crawl = app.crawl_url("https://docs.example.com", params={
    "limit": 500,           # Max pages
    "maxDepth": 3,          # Link depth
    "includePaths": ["/docs/*"],
    "excludePaths": ["/blog/*"],
})
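
To make the effect of includePaths and excludePaths concrete, here is a minimal local sketch of the same idea: keep a URL only if its path matches an include pattern and no exclude pattern. It uses Python's glob-style matching purely for illustration. The helper name keep_url and the sample URLs are hypothetical, and Firecrawl's own matching rules may differ (for example, it may treat patterns as regular expressions).

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def keep_url(url, include_paths, exclude_paths):
    """Return True if the URL's path passes the include/exclude filters.

    Illustrative only: glob matching is an assumption, not Firecrawl's rule.
    """
    path = urlparse(url).path
    # Exclusions win over inclusions.
    if any(fnmatchcase(path, pattern) for pattern in exclude_paths):
        return False
    # An empty include list means "include everything".
    return not include_paths or any(
        fnmatchcase(path, pattern) for pattern in include_paths
    )


# Hypothetical URLs for demonstration.
urls = [
    "https://docs.example.com/docs/intro",
    "https://docs.example.com/blog/launch",
    "https://docs.example.com/pricing",
]
kept = [u for u in urls if keep_url(u, ["/docs/*"], ["/blog/*"])]
print(kept)
```

Running a filter like this over a list of known URLs is a cheap way to sanity-check your patterns before spending crawl credits.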

3. Structured Extraction

result = app.scrape_url("https://example.com/pricing", params={
    "formats": ["extract"],
    "extract": {
        "schema": {
            "type": "object",
            "properties": {
                "plans": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "price": {"type": "string"},
                            "features": {"type": "array", "items": {"type": "string"}}
                        }
                    }
                }
            }
        }
    }
})
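
Once the extraction comes back, it usually helps to convert it into typed objects before using it downstream. The sketch below assumes you have already pulled the extracted object out of the response (where it lives depends on your SDK version) and that it follows the schema above. Plan, parse_plans, and the sample payload are hypothetical names and data for illustration, not Firecrawl output.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    name: str
    price: str
    features: list[str] = field(default_factory=list)


def parse_plans(extracted: dict) -> list[Plan]:
    """Convert an extracted payload into Plan objects, skipping malformed items."""
    plans = []
    for item in extracted.get("plans", []):
        # Skip entries that are not objects or lack the fields we rely on.
        if not isinstance(item, dict) or "name" not in item or "price" not in item:
            continue
        plans.append(
            Plan(
                name=str(item["name"]),
                price=str(item["price"]),
                features=[str(f) for f in item.get("features", [])],
            )
        )
    return plans


# Hypothetical payload shaped like the schema above; not real Firecrawl output.
sample = {
    "plans": [
        {"name": "Free", "price": "$0", "features": ["500 pages"]},
        {"name": "Broken"},  # missing "price", so it is skipped
    ]
}
plans = parse_plans(sample)
print(plans)
```

LLM-based extraction can return incomplete items, so validating each entry before use is a reasonable default.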

4. Map (Discover URLs)

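links = app.map_url("https://example.com")
# Depending on the SDK version, the result may be a plain list of URLs or a
# response object that wraps the list in a `links` field; adjust len() to match.
print(f"Found {len(links)} pages")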
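
A discovered URL list often contains near-duplicates (trailing slashes, fragments, mixed-case hosts). As a rough sketch of one way to tidy such a list before crawling, the snippet below normalizes and deduplicates plain URL strings. The helpers normalize and dedupe and the sample URLs are hypothetical; this is post-processing you would write yourself, not part of Firecrawl.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lowercase scheme and host, drop the fragment, and trim trailing slashes."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def dedupe(urls):
    """Return normalized URLs in first-seen order, without duplicates."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical discovered URLs for demonstration.
found = [
    "https://Example.com",
    "https://example.com/",
    "https://example.com/docs/",
    "https://example.com/docs#intro",
]
unique = dedupe(found)
print(unique)
```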

5. Self-Hosting

git clone https://github.com/mendableai/firecrawl
cd firecrawl
docker compose up -d
# API at http://localhost:3002

Use Cases

Use Case	How
RAG Pipeline	Crawl docs → markdown → embed → vector DB
Competitive Intel	Scrape competitor pricing pages
Training Data	Extract clean text from web sources
Monitoring	Track website changes over time
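
For the RAG pipeline row, the step between "markdown" and "embed" is usually chunking. The following is a minimal sketch, assuming you already have a page's markdown as a string: it splits at headings and caps each piece at a maximum length. The function name chunk_markdown, the max_chars default, and the sample document are illustrative choices, and the embedding and vector DB steps are left out.

```python
import re


def chunk_markdown(markdown: str, max_chars: int = 1000) -> list[str]:
    """Split markdown at headings, then cap each section at max_chars."""
    # Split before any line that starts with 1-6 '#' characters and a space.
    sections = re.split(r"(?m)^(?=#{1,6} )", markdown)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        # Hard-wrap oversized sections into fixed-size pieces.
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks


# Hypothetical scraped markdown for demonstration.
doc = "# Title\nIntro text.\n\n## Install\nRun the installer.\n"
chunks = chunk_markdown(doc)
print(chunks)
```

Heading-based splitting keeps each chunk on one topic, which tends to help retrieval. Note that this simple pattern also splits on lines starting with "# " inside code samples, so a production splitter would need to handle that case.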

Pricing

Tier	Pages/mo	Price
Free	500	$0
Hobby	3,000	$16/mo
Standard	100,000	$83/mo
Self-hosted	Unlimited	Free

FAQ

Q: How does it handle JavaScript-heavy sites? A: Firecrawl uses headless browsers to render JavaScript before extraction.

Q: Can I self-host? A: Yes. Firecrawl is fully open source, and a Docker Compose deployment is available.

Q: How does it compare to Jina Reader? A: Firecrawl offers full site crawling, structured extraction, and sitemap discovery. Jina Reader is simpler: you add a URL prefix to fetch a single page.
