Scripts · April 8, 2026 · 1 min read

Zerox — Zero-Shot PDF OCR for AI Pipelines

Extract text from any PDF by using vision models as OCR. Zerox converts PDF pages to images, then uses GPT-4o or Claude to extract clean Markdown without any training.

What is Zerox?

Zerox is a zero-shot PDF OCR tool that replaces traditional OCR with vision-language models. It converts each PDF page to an image and asks a vision model such as GPT-4o or Claude to transcribe the page as clean Markdown.
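The flow above can be sketched from Python. This is a minimal sketch, assuming the `py-zerox` package exposes an async `zerox` function that accepts `file_path` and `model` keyword arguments. The `pdf_to_markdown` wrapper and its `ocr` parameter are hypothetical names introduced here so the OCR call can be swapped out without network access.

```python
import asyncio


async def pdf_to_markdown(file_path, model="gpt-4o-mini", ocr=None):
    """Run vision-model OCR on a PDF and return whatever the OCR call yields.

    `ocr` is a hypothetical injection point: any async callable that takes
    `file_path` and `model` keyword arguments. When it is omitted, this
    sketch assumes py-zerox is installed and uses its `zerox` coroutine.
    """
    if ocr is None:
        # Assumption: `pip install py-zerox` provides this import.
        from pyzerox import zerox as ocr
    return await ocr(file_path=file_path, model=model)


# Example call (needs py-zerox and a provider API key such as OPENAI_API_KEY):
# result = asyncio.run(pdf_to_markdown("report.pdf"))
```

Injecting the OCR callable keeps the wrapper testable offline and makes it easy to switch providers later.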

In one sentence: vision-model OCR that turns PDFs into Markdown with GPT-4o, Claude, or Gemini, handles complex layouts and tables, and has 7k+ GitHub stars.

For: Teams building RAG pipelines or handling PDFs.

Core Features

1. Multi-Model Support

Works with GPT-4o, Claude, and Gemini.

2. Page Selection

Process only specific pages instead of the whole document.
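Page selections often arrive as a string such as "1,3,5-7". The helper below is a hypothetical convenience, not part of Zerox, that expands such a string into a sorted list of page numbers. It assumes pages are numbered from 1.

```python
def parse_page_spec(spec):
    """Expand a spec like "1,3,5-7" into a sorted list of 1-indexed pages."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid range: {part}")
            pages.update(range(start, end + 1))  # ranges are inclusive
        else:
            pages.add(int(part))
    if any(p < 1 for p in pages):
        raise ValueError("page numbers start at 1")
    return sorted(pages)
```

The resulting list could then be handed to the library's page-selection option, believed to be `select_pages` in the Python package. Verify the name against the project README.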

3. Custom Prompts

Supply a custom prompt to control the extraction format.
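As an illustration of a custom prompt, the sketch below composes an instruction that asks the model for a single Markdown table. The function name, the wording of the prompt, and the column names are all hypothetical. How the prompt is passed to Zerox is also an assumption (the Python package is believed to accept a `custom_system_prompt` argument).

```python
def build_extraction_prompt(columns):
    """Compose a prompt asking for one Markdown table with the given columns."""
    header = " | ".join(columns)
    return (
        "Convert the page to Markdown. "
        "Return a single table with exactly these columns: "
        f"{header}. "
        "Do not add commentary or code fences."
    )


# Hypothetical column names for an invoice-style document.
prompt = build_extraction_prompt(["Date", "Description", "Amount"])
```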

FAQ

Q: How much does it cost? A: About $0.01 per page with GPT-4o-mini.

Q: Does it handle scans? A: Yes. This is its main use case: vision models read scanned text and handwriting.
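The per-page figure in the cost answer above makes budgeting simple arithmetic. The sketch below uses the roughly $0.01 per page quoted for GPT-4o-mini as a default. The function name is hypothetical, and the rate is an approximation that depends on the provider's current pricing.

```python
from decimal import Decimal


def estimate_cost_usd(pages, per_page_usd="0.01"):
    """Estimate OCR spend as pages times a flat per-page rate.

    Decimal avoids binary floating-point rounding in money arithmetic.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return Decimal(per_page_usd) * pages


# A 250-page document at about $0.01 per page costs about $2.50.
cost = estimate_cost_usd(250)
```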

Sources and Thanks

getomni-ai/zerox (7k+ stars, MIT license)
