What is Firecrawl Extract — Structured Data from Any URL?

Firecrawl Extract pulls structured JSON from any URL using a Pydantic/Zod schema. Skip the regex/CSS dance — describe the shape, get clean data.

Is Firecrawl Extract — Structured Data from Any URL free to use?

Yes. Firecrawl Extract — Structured Data from Any URL is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install Firecrawl Extract — Structured Data from Any URL?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Firecrawl Extract — Structured Data from Any URL

Name: Firecrawl Extract — Structured Data from Any URL
Author: Firecrawl

Quick Use

1. Sign up at firecrawl.dev and get an API key (500 free credits).
2. Install the SDK: pip install firecrawl-py (or npm install @mendable/firecrawl-js).
3. Use the Pydantic-schema extract snippet below.

Intro

Firecrawl Extract is the structured-data endpoint built on top of Firecrawl's scraper. Pass a URL and a JSON schema; get back validated data. No CSS selectors, no XPath, no regex: Firecrawl runs the page through an LLM with your schema and returns the result.

Best for: agents that scrape e-commerce sites, job boards, news sites, or any source that is structured but laid out differently on each site.
Works with: Firecrawl REST API, Firecrawl Python / Node SDK, MCP server.
Setup time: 2 minutes (sign up at firecrawl.dev for an API key).

One-shot extract

from firecrawl import FirecrawlApp
from pydantic import BaseModel

app = FirecrawlApp(api_key="fc-YOUR-KEY")

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool
    rating: float | None

result = app.extract(
    urls=["https://store.example.com/widgets"],
    schema=Product.model_json_schema(),
    prompt="Extract the headline product on this page",
)

print(result.data)
# {'name': 'Widget Pro', 'price': 49.99, 'in_stock': True, 'rating': 4.6}
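Because the extraction is LLM-driven, it can be worth re-checking the returned dict locally before passing it on. The sketch below is a stdlib-only check that mirrors the Product fields. It assumes the extracted data is a plain dict like the one printed above; `check_product` and `PRODUCT_FIELDS` are hypothetical names for illustration, not part of the Firecrawl SDK.

```python
from typing import Any

# Expected fields and accepted Python types, mirroring the Product model.
# rating is nullable, so None is accepted there.
PRODUCT_FIELDS: dict[str, tuple[type, ...]] = {
    "name": (str,),
    "price": (int, float),
    "in_stock": (bool,),
    "rating": (int, float, type(None)),
}


def check_product(data: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the dict looks valid."""
    problems = []
    for field, allowed in PRODUCT_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
            continue
        value = data[field]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if isinstance(value, bool) and bool not in allowed:
            problems.append(f"wrong type for {field}: bool")
        elif not isinstance(value, allowed):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    return problems


sample = {"name": "Widget Pro", "price": 49.99, "in_stock": True, "rating": 4.6}
print(check_product(sample))
```

If Pydantic is already installed, re-validating the dict against the Product model itself would be the more natural choice; the hand-rolled check only keeps this sketch dependency-free.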

Extract across many URLs at once

result = app.extract(
    urls=[
        "https://store.example.com/widget-1",
        "https://store.example.com/widget-2",
        "https://store.example.com/widget-3",
    ],
    schema={
        "type": "object",
        "properties": {
            "products": {
                "type": "array",
                "items": Product.model_json_schema(),
            }
        }
    },
)
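With the wrapper schema above, the extracted data would be expected to carry a `products` array. A minimal sketch of unpacking that array into typed rows follows. It assumes the response data arrives as a plain dict of the shape `{"products": [...]}`; `ProductRow` and `unpack_products` are hypothetical helpers, and the sample data is made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ProductRow:
    name: str
    price: float
    in_stock: bool
    rating: Optional[float] = None


def unpack_products(data: dict[str, Any]) -> list[ProductRow]:
    """Turn {"products": [...]} into typed rows, skipping incomplete entries."""
    rows = []
    for item in data.get("products", []):
        try:
            rating = item.get("rating")
            rows.append(
                ProductRow(
                    name=str(item["name"]),
                    price=float(item["price"]),
                    in_stock=bool(item["in_stock"]),
                    rating=None if rating is None else float(rating),
                )
            )
        except (KeyError, TypeError, ValueError):
            # Entry is missing a required field or holds an unconvertible value.
            continue
    return rows


# Hypothetical data in the shape requested by the schema.
sample = {
    "products": [
        {"name": "Widget 1", "price": 19.0, "in_stock": True, "rating": 4.2},
        {"name": "Widget 2", "price": 29.5, "in_stock": False, "rating": None},
        {"name": "Widget 3"},  # incomplete entry, dropped
    ]
}
rows = unpack_products(sample)
print([row.name for row in rows])
```

Dropping incomplete entries silently is a design choice for the sketch; a production pipeline might log them or re-run the extraction for the affected URL instead.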

Use as MCP server

Add to your MCP config:

{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": { "FIRECRAWL_API_KEY": "fc-YOUR-KEY" }
    }
  }
}

Now Claude Code / Cursor / Codex CLI can call firecrawl_scrape, firecrawl_extract, firecrawl_crawl, firecrawl_map directly.
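If an MCP config file already lists other servers, the firecrawl entry needs to be merged in rather than pasted over the whole file. The following is a small sketch of that merge on an in-memory dict. The location of the config file differs per client and is deliberately left out; `add_firecrawl_server` and the `other` server entry are hypothetical.

```python
import copy
import json


def add_firecrawl_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP config with the firecrawl entry added or replaced."""
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["firecrawl"] = {
        "command": "npx",
        "args": ["-y", "firecrawl-mcp"],
        "env": {"FIRECRAWL_API_KEY": api_key},
    }
    return merged


# Hypothetical existing config with one unrelated server already present.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_firecrawl_server(existing, "fc-YOUR-KEY")
print(json.dumps(updated, indent=2))
```

The function works on a deep copy, so the original config is left untouched until the result is written back.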

Cost vs accuracy

Endpoint	Cost	Use
`/scrape`	1 credit	Just markdown, no LLM
`/extract`	1-5 credits	Structured data via LLM
`/crawl`	1 credit/page	Multi-page site dump
`/map`	Free	Discover all URLs on a domain first
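The table can be turned into a rough budget check before a job runs. This sketch takes the per-call credit figures from the table as given (actual billing may differ) and returns a low and high bound, since `/extract` is listed as a range. `CREDITS` and `estimate_credits` are hypothetical names, and the example plan is made up.

```python
# Per-call credit costs from the table, as (low, high) bounds.
CREDITS: dict[str, tuple[int, int]] = {
    "scrape": (1, 1),
    "extract": (1, 5),
    "crawl": (1, 1),  # per page crawled
    "map": (0, 0),
}


def estimate_credits(calls: dict[str, int]) -> tuple[int, int]:
    """Return (low, high) credit bounds for a mapping of endpoint -> call count."""
    low = sum(CREDITS[endpoint][0] * count for endpoint, count in calls.items())
    high = sum(CREDITS[endpoint][1] * count for endpoint, count in calls.items())
    return low, high


# Hypothetical job: map a domain once, extract 40 pages, scrape 10 more.
plan = {"map": 1, "extract": 40, "scrape": 10}
low, high = estimate_credits(plan)
print(f"{low} to {high} credits")  # prints "50 to 210 credits"
```

Running `/map` first and extracting only the URLs that matter keeps the upper bound down, which is the usage pattern the table's last row suggests.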

FAQ

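Q: Is Firecrawl Extract free? A: Free tier: 500 credits/month for testing. Hobby plan starts at $19/mo for 5K credits. Self-hosting the open-source repository costs nothing in fees, but you run your own crawler infrastructure. Note that the core Firecrawl repository is licensed under AGPL-3.0, with the SDKs under MIT.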

Q: How is Extract different from regular Scrape? A: Scrape returns the raw markdown of a page. Extract runs that through an LLM with your schema and returns validated structured data. Extract is more expensive per call but skips post-processing entirely.

Q: Can I self-host Firecrawl? A: Yes. The Firecrawl repo is open source (AGPL-3.0 for the core, MIT for the SDKs) and runs on Docker. Self-hosting saves money at scale, but you manage Playwright, proxies, and the job queue yourself. The hosted service is faster to start with.

Source & Thanks

Built by Firecrawl (Mendable). The core repository is licensed under AGPL-3.0 and the SDKs under MIT; the hosted service is commercial.

firecrawl/firecrawl — ⭐ 30,000+

Similar assets

Firecrawl MCP — Web Scraping Server for AI Agents

Tavily Extract — Pull Clean Content from Any URL

Firecrawl — Web Scraping API for AI Applications

Instructor — Structured LLM Outputs with Pydantic