Key Features
- Markdown output — Clean, LLM-ready text extraction
- JavaScript rendering — Handles SPAs and dynamic content
- Structured extraction — CSS selectors, schema-based extraction
- Chunking strategies — Topic-based, fixed-size, or semantic chunking
- Media extraction — Images, links, metadata
- Rate limiting — Built-in politeness and throttling
- Async — Fast concurrent crawling
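
The rate-limiting and async features listed above can be pictured with a small, library-independent sketch. It is not this project's API: `crawl_all`, `FakeFetcher`, `max_concurrency`, and `delay` are hypothetical names chosen for illustration, and the fetcher is a stand-in that makes no network requests. It shows only the general pattern of capping in-flight requests with a semaphore and pausing briefly before each slot is released.

```python
import asyncio


async def crawl_all(urls, fetch, max_concurrency=3, delay=0.0):
    """Run fetch(url) for every URL with at most max_concurrency in flight.

    Hypothetical helper for illustration; not part of any library.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url):
        async with semaphore:
            result = await fetch(url)
            # Politeness pause: hold the slot briefly before releasing it.
            await asyncio.sleep(delay)
            return result

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(worker(u) for u in urls))


class FakeFetcher:
    """Stand-in for a real HTTP fetch; records the peak number of concurrent calls."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0

    async def __call__(self, url):
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.01)  # simulate network latency
        self.in_flight -= 1
        return f"# content of {url}"


fetcher = FakeFetcher()
urls = [f"https://example.com/page{i}" for i in range(10)]
pages = asyncio.run(crawl_all(urls, fetcher, max_concurrency=3, delay=0.01))
print(len(pages), "pages fetched; peak concurrency:", fetcher.peak)
```

Holding the semaphore during the pause means the delay throttles overall throughput instead of only spacing out a single worker's requests. A real crawler would typically also track delays per host.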