Configs · Mar 31, 2026 · 2 min read

GPT Crawler — Build Custom GPTs from Any Website

Crawl any website to generate knowledge files for custom GPTs and RAG. Output as JSON for OpenAI GPTs or any LLM knowledge base. Zero config. 22K+ stars.

Introduction

GPT Crawler turns any website into a knowledge file for custom GPTs and RAG pipelines. Point it at documentation, help centers, or any website — it crawls pages, extracts clean text, and outputs structured JSON ready for OpenAI's GPT Builder or any LLM knowledge base. Zero AI cost — it's a pure crawler, not an LLM app. 22,000+ GitHub stars, ISC licensed.

Best for: Creating custom GPTs from documentation sites, building RAG knowledge bases from web content

Works with: OpenAI GPTs, Claude Projects, any RAG pipeline (LangChain, LlamaIndex)


Key Features

One-Command Crawl

Point at any URL with a glob pattern — get structured JSON output.

Smart Extraction

Extracts main content, strips navigation/ads/boilerplate. Clean text optimized for LLMs.

Configurable

  • maxPagesToCrawl: cap on the number of pages crawled
  • match: URL glob pattern(s) selecting which pages to crawl
  • selector: CSS selector for the element whose content is extracted
  • maxTokens: cap on output size, so the file fits GPT upload limits
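
To make the options above concrete, here is a minimal TypeScript sketch of how a glob-style `match` pattern and a `maxPagesToCrawl` cap could narrow down a list of discovered URLs. Only the four option names come from this page. The `CrawlConfig` type, the `url` field, and the helpers `globToRegExp` and `selectUrls` are hypothetical names introduced for illustration, and the glob handling is deliberately simplified; it is not the tool's actual implementation.

```typescript
// Hypothetical config shape: only the four option names listed above come
// from this page; `url` and the helper functions are illustrative assumptions.
interface CrawlConfig {
  url: string;             // start URL (assumed field name)
  match: string;           // URL glob pattern for pages to include
  selector?: string;       // CSS selector for the content container
  maxPagesToCrawl: number; // upper bound on pages kept
  maxTokens?: number;      // upper bound on output size
}

const config: CrawlConfig = {
  url: "https://example.com/docs",
  match: "https://example.com/docs/**",
  selector: "main",
  maxPagesToCrawl: 2,
  maxTokens: 2000000,
};

// Simplified glob-to-regex conversion: `**` matches any characters,
// `*` matches any characters except `/`. Real glob libraries support more.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // placeholder so `**` survives the `*` pass
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// Keep URLs that match the include pattern, then apply the page cap.
function selectUrls(urls: string[], cfg: CrawlConfig): string[] {
  const include = globToRegExp(cfg.match);
  return urls.filter((u) => include.test(u)).slice(0, cfg.maxPagesToCrawl);
}

const discovered = [
  "https://example.com/docs/intro",
  "https://example.com/blog/news",
  "https://example.com/docs/guides/setup",
  "https://example.com/docs/faq",
];
const selected = selectUrls(discovered, config);
console.log(selected);
```

With the sample values, the blog URL is filtered out by the pattern and the third matching docs URL is dropped by the page cap.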

Output Formats

JSON array of {title, url, text} objects — ready for:

  • OpenAI GPT Builder (upload as knowledge)
  • Claude Projects (upload as context)
  • Any RAG vector store ingestion
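
As a rough illustration of the last item, the sketch below parses a JSON array in the shape described above and splits each page into smaller chunks before vector store ingestion. The field names `title`, `url`, and `text` follow this page's description and should be checked against the actual output file. The `Chunk` type, the `chunkPages` helper, and the character-based limit are assumptions made for the example, not part of GPT Crawler.

```typescript
// Page shape as described on this page; verify against the real output file.
interface CrawledPage {
  title: string;
  url: string;
  text: string;
}

// Hypothetical chunk record for a vector store.
interface Chunk {
  id: string;     // page URL plus chunk index
  source: string; // page URL, kept for citations
  title: string;
  text: string;
}

// Split each page's text into chunks of at most `maxChars` characters,
// breaking only on whitespace. A single word longer than `maxChars`
// becomes its own oversized chunk.
function chunkPages(pages: CrawledPage[], maxChars: number): Chunk[] {
  const chunks: Chunk[] = [];
  for (const page of pages) {
    const words = page.text.split(/\s+/).filter((w) => w.length > 0);
    let current = "";
    let index = 0;
    const flush = (): void => {
      chunks.push({
        id: `${page.url}#${index}`,
        source: page.url,
        title: page.title,
        text: current,
      });
      index += 1;
    };
    for (const word of words) {
      // Start a new chunk when adding the next word would exceed the limit.
      if (current.length > 0 && current.length + 1 + word.length > maxChars) {
        flush();
        current = word;
      } else {
        current = current.length > 0 ? `${current} ${word}` : word;
      }
    }
    if (current.length > 0) flush();
  }
  return chunks;
}

// Inline sample standing in for the contents of the crawler's output file.
const raw =
  '[{"title":"Intro","url":"https://example.com/docs/intro","text":"alpha beta gamma delta"}]';
const pages = JSON.parse(raw) as CrawledPage[];
const chunks = chunkPages(pages, 11);
console.log(chunks.map((c) => `${c.id}: ${c.text}`));
```

Keeping the source URL on every chunk lets a RAG pipeline cite the original page when a chunk is retrieved.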

FAQ

Q: What is GPT Crawler?
A: A tool that crawls any website and outputs structured JSON for creating custom GPTs and RAG knowledge bases. No AI cost: it is pure web crawling. 22K+ stars.

Q: How is it different from Crawl4AI or Firecrawl?
A: GPT Crawler is simpler and focused specifically on generating GPT knowledge files. Crawl4AI and Firecrawl offer more features (JS rendering, structured extraction, APIs).



Source and Acknowledgments

Created by Builder.io. Licensed under ISC. BuilderIO/gpt-crawler — 22,000+ GitHub stars
