What is Firecrawl Extract — Structured Data from Any URL?

Firecrawl Extract pulls structured JSON from any URL using a Pydantic or Zod schema. Instead of writing regexes or CSS selectors, you describe the shape of the data and get clean results back.

Is Firecrawl Extract — Structured Data from Any URL free to use?

Yes. Firecrawl Extract — Structured Data from Any URL is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install Firecrawl Extract — Structured Data from Any URL?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Firecrawl Extract — Structured Data from Any URL

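Introduction

Firecrawl Extract is the structured-data interface on top of Firecrawl's scraping layer. Pass a URL and a JSON schema, and get validated data back. No CSS selectors, no XPath, no regex: Firecrawl runs the page through an LLM, extracts according to your schema, and returns the result. It suits e-commerce, job listings, news, or any source where the kind of data is similar but every site is laid out differently. It works with the Firecrawl REST API, Python SDK, Node SDK, and MCP server. Setup takes about 2 minutes (sign up at firecrawl.dev to get a key).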
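The "URL plus JSON schema" request can also be sketched against the REST API with only the standard library. This is a minimal sketch, assuming a `POST /v1/extract` endpoint on `api.firecrawl.dev` that takes a bearer token and a JSON body with `urls`, `prompt`, and `schema`; the version segment of the path may differ, so check the current API reference. The helper name `build_extract_request` is hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

# Assumed endpoint; the version segment may differ in the current API.
EXTRACT_URL = "https://api.firecrawl.dev/v1/extract"

# Hand-written JSON schema describing the shape we want back.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price", "in_stock"],
}


def build_extract_request(api_key, urls, schema, prompt=None):
    """Build (but do not send) an extract request. Hypothetical helper."""
    body = {"urls": list(urls), "schema": schema}
    if prompt:
        body["prompt"] = prompt
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_extract_request(
    "fc-YOUR-KEY",
    ["https://store.example.com/widgets"],
    PRODUCT_SCHEMA,
    prompt="Extract the headline product on this page",
)
# Sending it would be urllib.request.urlopen(req), with a real key.
```

The SDKs wrap this same idea, so the sketch is mainly useful for seeing what travels over the wire.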

One-shot extraction

from firecrawl import FirecrawlApp
from pydantic import BaseModel

app = FirecrawlApp(api_key="fc-YOUR-KEY")

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool
    rating: float | None = None  # optional; stays None when the page shows no rating

result = app.extract(
    urls=["https://store.example.com/widgets"],
    schema=Product.model_json_schema(),
    prompt="Extract the headline product on this page",
)

print(result.data)
# {'name': 'Widget Pro', 'price': 49.99, 'in_stock': True, 'rating': 4.6}
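
Because the extraction goes through an LLM, it can be worth re-checking the returned dictionary before using it. The following is a minimal standard-library sketch, assuming the result is a plain dict shaped like the printed example; `FIELD_TYPES` and `check_product` are hypothetical names. With Pydantic available, `Product.model_validate(...)` covers the same ground more thoroughly.

```python
# Expected field types, mirroring the Product model above.
# bool is listed explicitly because bool is a subclass of int in Python.
FIELD_TYPES = {
    "name": (str,),
    "price": (int, float),
    "in_stock": (bool,),
    "rating": (int, float, type(None)),
}


def check_product(data):
    """Return a list of problems found in an extracted product dict."""
    problems = []
    for field, allowed in FIELD_TYPES.items():
        if field not in data:
            # Only fields that may be None are allowed to be absent.
            if type(None) not in allowed:
                problems.append(f"missing field: {field}")
            continue
        value = data[field]
        # Reject True/False where a number is expected.
        if isinstance(value, bool) and bool not in allowed:
            problems.append(f"wrong type for {field}: bool")
        elif not isinstance(value, allowed):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    return problems


good = {"name": "Widget Pro", "price": 49.99, "in_stock": True, "rating": 4.6}
bad = {"name": "Widget Pro", "price": "49.99", "rating": None}

print(check_product(good))
print(check_product(bad))
```

An empty list means the dict matches the expected shape; anything else names the fields to look at.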

Extracting multiple URLs at once

result = app.extract(
    urls=[
        "https://store.example.com/widget-1",
        "https://store.example.com/widget-2",
        "https://store.example.com/widget-3",
    ],
    schema={
        "type": "object",
        "properties": {
            "products": {
                "type": "array",
                "items": Product.model_json_schema(),
            }
        }
    },
)
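
Once a batch comes back, the wrapped `products` array can be handled as ordinary Python data. A minimal sketch, assuming the response data is a dict with a `products` list as the schema above requests; the sample values and the helper `cheapest_in_stock` are made up for illustration.

```python
def cheapest_in_stock(data):
    """Return the cheapest in-stock product from an extract result, or None."""
    products = data.get("products", [])
    available = [p for p in products if p.get("in_stock")]
    return min(available, key=lambda p: p["price"], default=None)


# Illustrative data only; not real output.
sample = {
    "products": [
        {"name": "Widget 1", "price": 19.0, "in_stock": True, "rating": 4.1},
        {"name": "Widget 2", "price": 12.5, "in_stock": False, "rating": None},
        {"name": "Widget 3", "price": 15.0, "in_stock": True, "rating": 4.8},
    ]
}

best = cheapest_in_stock(sample)
print(best["name"] if best else "nothing in stock")
```

Returning `None` for an empty or missing list keeps the caller from having to special-case pages where nothing was extracted.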

Using it as an MCP server

Add this to your MCP config:

{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": { "FIRECRAWL_API_KEY": "fc-YOUR-KEY" }
    }
  }
}

After that, Claude Code, Cursor, and Codex CLI can call firecrawl_scrape, firecrawl_extract, firecrawl_crawl, and firecrawl_map directly.
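
If the config file already lists other servers, the entry can be merged in rather than pasted over. A minimal sketch, assuming the config is a JSON object with an `mcpServers` key as shown above; the function name `add_firecrawl_server` is hypothetical, and where the config file lives depends on the client.

```python
import json


def add_firecrawl_server(config, api_key):
    """Return a copy of an MCP config dict with the firecrawl entry added."""
    merged = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    servers = merged.setdefault("mcpServers", {})
    servers["firecrawl"] = {
        "command": "npx",
        "args": ["-y", "firecrawl-mcp"],
        "env": {"FIRECRAWL_API_KEY": api_key},
    }
    return merged


# Example: an existing config with one unrelated (made-up) server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_firecrawl_server(existing, "fc-YOUR-KEY")
print(json.dumps(updated, indent=2))
```

Working on a copy leaves the original dict untouched, so the result can be inspected before it is written back to disk.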

Cost versus accuracy

Endpoint	Cost	Use
`/scrape`	1 credit	Plain markdown, no LLM
`/extract`	1-5 credits	Structured data via an LLM
`/crawl`	1 credit per page	Crawling multi-page sites
`/map`	Free	Discovering all URLs on a domain first
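
The table translates directly into a rough budget. A minimal sketch using only the per-call figures listed above (1 credit per scrape or crawled page, 1 to 5 per extract, map free); actual billing may differ, and `estimate_credits` is a hypothetical helper.

```python
# Per-call credit ranges (min, max) taken from the table above.
CREDITS = {
    "scrape": (1, 1),
    "extract": (1, 5),
    "crawl_page": (1, 1),
    "map": (0, 0),
}


def estimate_credits(calls):
    """Return (low, high) credit totals for a dict of {operation: count}."""
    low = sum(CREDITS[op][0] * n for op, n in calls.items())
    high = sum(CREDITS[op][1] * n for op, n in calls.items())
    return low, high


# One map call, a 200-page crawl, then extract on 50 product pages.
job = {"map": 1, "crawl_page": 200, "extract": 50}
low, high = estimate_credits(job)
print(f"{low}-{high} credits")
```

For this job the estimate is 250 to 450 credits, with the spread coming entirely from the extract calls.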

FAQ

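Q: Is Firecrawl Extract free? A: The free tier includes 500 credits per month for testing. The Hobby plan is $19/month for 5,000 credits. Self-hosting (MIT open source) is free, but you maintain the crawler infrastructure yourself.

Q: How does Extract differ from a plain Scrape? A: Scrape returns the page's raw markdown. Extract runs the page through an LLM with your schema and returns validated structured data. A single Extract call costs more, but it removes the post-processing step.

Q: Can Firecrawl be self-hosted? A: Yes. The Firecrawl repository is MIT open source and runs under Docker. At scale, self-hosting saves money, but you manage Playwright, proxies, and queues yourself. The hosted version is quicker to get started with.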
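
To make the Scrape versus Extract answer above concrete, this is the kind of post-processing a plain scrape leaves to you. A minimal sketch over a made-up markdown snippet; the regular expressions only fit that snippet's layout, which is exactly the brittleness Extract is meant to avoid.

```python
import re

# Made-up markdown, standing in for what a scrape might return.
markdown = """# Widget Pro

**Price:** $49.99
**Availability:** In stock
"""


def parse_product(md):
    """Pull name, price, and stock status out of one specific page layout."""
    name = re.search(r"^# (.+)$", md, re.MULTILINE)
    price = re.search(r"\*\*Price:\*\* \$([0-9]+(?:\.[0-9]+)?)", md)
    stock = re.search(r"\*\*Availability:\*\* (.+)", md)
    return {
        "name": name.group(1) if name else None,
        "price": float(price.group(1)) if price else None,
        "in_stock": bool(stock) and stock.group(1).strip().lower() == "in stock",
    }


print(parse_product(markdown))
```

A second store with a different layout would need its own set of patterns, whereas the same schema can be reused across sites with Extract.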
