Scripts · March 29, 2026 · 1 min read

Firecrawl — Web Scraping API for LLMs

Turn any website into LLM-ready markdown. API-first web scraping with JavaScript rendering, auto-pagination, structured extraction, and batch crawling.

TokRepo Featured · Community
Quick Start

Use it first, then decide whether to dig deeper

This section should tell both users and agents what to copy first, what to install, and where it goes.

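# Install the Python SDK (run this line in a shell)
pip install firecrawl-py

# Scrape a single page and print it as markdown (Python)
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-...")  # replace with your own API key
result = app.scrape_url("https://example.com", params={"formats": ["markdown"]})
print(result["markdown"])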
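
The snippet above indexes the scrape result as a dict. The return type may differ between SDK releases, so a small defensive accessor can be useful. The following is a minimal sketch, not part of firecrawl-py: `get_markdown` is a hypothetical helper name, and the attribute-style result it also accepts is an assumption rather than something this page documents.

```python
from types import SimpleNamespace


def get_markdown(result):
    """Return the markdown text from a scrape result, or "" if absent.

    Hypothetical helper (not part of firecrawl-py). Accepts either a dict
    such as {"markdown": "..."} or an object exposing a .markdown attribute.
    """
    if isinstance(result, dict):
        text = result.get("markdown")
    else:
        text = getattr(result, "markdown", None)
    return text or ""


if __name__ == "__main__":
    # Stand-in values; a real result would come from app.scrape_url(...)
    print(get_markdown({"markdown": "# Hello"}))
    print(get_markdown(SimpleNamespace(markdown="# Hello")))
```

Returning an empty string instead of raising keeps downstream code simple when a page yields no markdown; raise an exception instead if a missing body should stop the pipeline.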

Introduction

Firecrawl is an API that converts any webpage into clean markdown optimized for LLMs. It handles JavaScript rendering, anti-bot measures, pagination, and sitemaps automatically.

Best for: RAG data ingestion, web research, content monitoring, competitive analysis
Works with: LangChain, LlamaIndex, any LLM pipeline
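
For RAG ingestion, scraped markdown is usually split into chunks before embedding. The following is a minimal sketch of one possible approach, heading-based chunking with a size cap. It is plain Python and independent of Firecrawl; `chunk_markdown` and `max_chars` are hypothetical names chosen for illustration.

```python
import re


def chunk_markdown(markdown, max_chars=1000):
    """Split markdown into chunks, starting a new chunk at each heading.

    Sections longer than max_chars are further split on blank lines.
    A single paragraph longer than max_chars is kept whole.
    """
    # Split before every line that starts with 1-6 '#' followed by a space
    sections = re.split(r"(?m)^(?=#{1,6} )", markdown)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_chars:
            chunks.append(section)
            continue
        # Oversized section: pack paragraphs greedily up to max_chars
        current = ""
        for para in section.split("\n\n"):
            candidate = f"{current}\n\n{para}" if current else para
            if len(candidate) > max_chars and current:
                chunks.append(current)
                current = para
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

One known limitation of this sketch: a `# comment` line inside a fenced code block is treated as a heading. A production splitter would track fence state or use a markdown parser.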


Features

  • Scrape — Single URL to markdown/HTML/structured data
  • Crawl — Entire site crawling with depth control
  • Map — Get all URLs from a website
  • Extract — Schema-based structured data extraction via LLM
  • Batch — Process thousands of URLs concurrently
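
As an illustration of the Scrape feature without the SDK, the sketch below builds (but does not send) an HTTP request using only the standard library. The base URL, the `/v1/scrape` route, the JSON body shape, and the Bearer header are assumptions about the hosted REST API that this page does not confirm; `build_scrape_request` is a hypothetical helper name. Check the official API reference before relying on it.

```python
import json
import urllib.request

# Assumption: hosted API base URL; a self-hosted instance would use its own
DEFAULT_BASE_URL = "https://api.firecrawl.dev"


def build_scrape_request(url, api_key, formats=("markdown",), base_url=DEFAULT_BASE_URL):
    """Build (but do not send) a POST request for the assumed /v1/scrape route."""
    payload = {"url": url, "formats": list(formats)}
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_scrape_request("https://example.com", api_key="fc-...")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req), then JSON-decode the body
```

Keeping request construction separate from sending makes the payload easy to inspect and unit-test without network access or an API key.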

Self-Hosted

git clone https://github.com/mendableai/firecrawl.git
cd firecrawl
docker compose up


Source and Acknowledgments

Created by Mendable. Licensed under AGPL-3.0. mendableai/firecrawl — 25K+ GitHub stars
