Document Processing

Best AI Tools for Document Processing (2026)

OCR engines, PDF parsers, document understanding, and extraction pipelines. Turn unstructured documents into structured, searchable data.

30 tools

MonkeyOCR — Lightweight Document Parsing Model

A lightweight large multimodal model optimized for accurate document parsing. It extracts text, tables, and structure from PDFs and images.

AI Open Source 24Configs

PyMuPDF — High-Performance PDF Processing for Python

PyMuPDF is a Python binding for the MuPDF library that provides fast, comprehensive PDF (and other document) processing. It supports text extraction, rendering, annotation, merging, form filling, and OCR across PDF, XPS, EPUB, and image formats.

Script Depot 13Scripts

MinerU — Extract LLM-Ready Data from Any Document

Convert PDFs, scans, and complex documents into clean Markdown or JSON for RAG and LLM pipelines. 57K+ GitHub stars.

Script Depot 494Scripts

Claude Official Skill: PDF — Read, Create & Edit PDFs

Claude Code skill for PDF files. Read content, extract data, create new PDFs, merge documents, and convert formats. Activates automatically.

Anthropic 418Skills

Zerox — Zero-Shot PDF OCR for AI Pipelines

Extract text from any PDF using vision models as OCR. Zerox converts PDF pages to images then uses GPT-4o or Claude to extract clean markdown without training.

Script Depot 412Skills
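
The two-step flow described for Zerox (render each PDF page to an image, then ask a vision model for Markdown) can be sketched as below. This is a minimal illustration, not Zerox's actual API: `pdf_to_markdown`, `render_pages`, and `page_to_markdown` are hypothetical names, and the renderer and model are passed in as plain callables so the sketch runs without a PDF library or a model key.

```python
from typing import Callable, List

# Hypothetical signatures: a renderer turns a PDF path into one image (bytes)
# per page, and a vision model turns one page image into Markdown.
RenderPages = Callable[[str], List[bytes]]
PageToMarkdown = Callable[[bytes], str]


def pdf_to_markdown(path: str, render_pages: RenderPages,
                    page_to_markdown: PageToMarkdown) -> str:
    """Render each page to an image, transcribe it, and join the results."""
    parts = []
    for number, image in enumerate(render_pages(path), start=1):
        text = page_to_markdown(image).strip()
        # Keep a page marker so downstream steps can trace text back to a page.
        parts.append(f"<!-- page {number} -->\n{text}")
    return "\n\n".join(parts)


# Stand-ins so the sketch runs without a real renderer or model API.
def fake_render(path: str) -> List[bytes]:
    return [b"image-1", b"image-2"]


def fake_model(image: bytes) -> str:
    return f"# Text from {image.decode()}\n"


markdown = pdf_to_markdown("report.pdf", fake_render, fake_model)
print(markdown)
```

In a real pipeline the two stand-ins would be replaced by an actual page renderer and a call to the chosen vision model.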

Kreuzberg — Polyglot Document Intelligence Framework with a Rust Core

An open-source document extraction framework that pulls text, metadata, images, and structured data from PDFs, Office files, images, and 97+ formats, with bindings for 11 programming languages.

Script Depot 402Skills

OpenDataLoader PDF — AI-Ready Document Parser

An open-source PDF parser that automates document accessibility and extracts structured, AI-ready data including tables, text, bounding boxes, and tagged content.

AI Open Source 359Skills

LiteParse — Fast Open-Source Document Parser in Rust

A fast, helpful, and open-source document parser by LlamaIndex that extracts structured text from PDFs and other documents with high speed and accuracy for RAG and AI pipelines.

Script Depot 184Scripts

DeepSeek-OCR — High-Accuracy Optical Context Compression

An OCR model and toolkit from DeepSeek AI that extracts text from images and documents with high accuracy, designed for feeding structured content into LLM pipelines.

AI Open Source 175Configs

Papermerge — Self-Hosted Document Management for Digital Archives

Papermerge is a self-hosted, open-source document management system with OCR, full-text search, and hierarchical folder organization for scanned documents and PDFs.

AI Open Source 102Configs

Xberg — Polyglot Document Intelligence Framework in Rust

A cross-language document extraction framework with a Rust core that parses PDFs, Office files, images, and 97+ formats into structured text and metadata.

AI Open Source 97Configs

Umi-OCR — Free Offline OCR Tool for Screenshots, Images & PDFs

Open-source, privacy-first OCR software that runs entirely offline. Supports batch image import, PDF recognition, QR code scanning, and multi-language text extraction without sending data to external servers.

Script Depot 47Scripts

DeepSeek OCR — Context-Aware Document Optical Compression

High-accuracy document OCR system by DeepSeek that converts scanned documents and PDFs into structured text with layout-aware compression.

AI Open Source 7Configs

Surya — Document OCR for 90+ Languages

Surya is a document OCR toolkit with 19.5K+ GitHub stars. Text recognition in 90+ languages, layout analysis, table detection, reading order, and LaTeX OCR. Benchmarks favorably against cloud OCR services.

Script Depot 678Skills

RAGFlow — Deep Document Understanding RAG Engine

Open-source RAG engine with deep document understanding. Parses complex PDFs, tables, images. Agent-powered Q&A with citations. Multi-model. 77K+ stars.

Script Depot 562Skills

Paperless-ngx — Self-Hosted Document Management with OCR

Paperless-ngx is an open-source document management system that scans, OCRs, indexes, and archives all your physical and digital documents for full-text search.

Script Depot 497Skills

Documenso — Open Source Document Signing Platform

Documenso is an open-source DocuSign alternative for self-hosted document signing, with PDF e-signatures, audit trails, and a Next.js stack.

AI Open Source 481Skills

Stirling PDF — Self-Hosted PDF Editor & Toolkit

Stirling PDF is the #1 open-source PDF tool on GitHub. Merge, split, convert, compress, OCR, sign, and edit PDFs — all self-hosted with no data leaving your server.

Script Depot 464Skills

Kotaemon — Open-Source RAG Document Chat

Clean, open-source RAG tool for chatting with your documents. Supports PDF, DOCX, web pages. Multi-model, citation, and multi-user. Self-hostable. 25K+ stars.

Script Depot 452Skills

Tesseract OCR — Open Source Text Recognition Engine for 100+ Languages

Tesseract is an open-source OCR engine, originally developed at Hewlett-Packard and later sponsored by Google, that supports over 100 languages. It converts images and scanned documents into machine-readable text with high accuracy and can write several output formats.

Script Depot 371Skills

Docling — Document Parsing for AI

IBM document parsing library. Converts PDFs, DOCX, PPTX, images, and HTML into structured markdown or JSON. Built for RAG pipelines and LLM ingestion.

Script Depot 364Skills, CLI Tools

BentoPDF — Privacy-First Self-Hosted PDF Toolkit

BentoPDF is a self-hosted web application that provides a comprehensive set of PDF tools including merging, splitting, converting, and OCR without sending files to external services.

AI Open Source 353Skills

Pandoc — Universal Document Format Converter

Pandoc is a universal document converter that reads and writes dozens of markup formats. It converts between Markdown, LaTeX, HTML, DOCX, EPUB, PDF, and many more with a single command.

Script Depot 326Skills
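
The "single command" mentioned for Pandoc is typically of the form `pandoc input -o output`, with formats inferred from the file extensions. The sketch below wraps that call from Python; the helper names `pandoc_command` and `convert` and the file names are illustrative assumptions, and the conversion only runs if a `pandoc` binary is found on the PATH.

```python
import shutil
import subprocess
from typing import List


def pandoc_command(source: str, target: str) -> List[str]:
    """Build the argv for a basic conversion; pandoc infers formats from extensions."""
    return ["pandoc", source, "-o", target]


def convert(source: str, target: str) -> bool:
    """Run the conversion if pandoc is installed; return False when it is not on PATH."""
    if shutil.which("pandoc") is None:
        return False
    subprocess.run(pandoc_command(source, target), check=True)
    return True


command = pandoc_command("notes.md", "notes.docx")
print(" ".join(command))
```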

Claude Office Skills — Docs/PDF/Sheets Skill Set

A curated repo of office-focused skills (docs, PDF, spreadsheets) and an Office MCP server; copy skills into Claude Code to standardize document workflows.

Skill Factory 323Skills

Gotenberg — API-Driven Document Conversion and PDF Generation Server

Docker-powered API server for converting HTML, Markdown, Office documents, and URLs into PDFs using Chromium and LibreOffice.

Script Depot 315Skills

PaddleOCR — AI-Powered OCR Toolkit for 100+ Languages

A lightweight, production-ready OCR system supporting 100+ languages. Bridges documents and images to structured data for LLM pipelines.

Script Depot 307Skills

KOReader — Document Viewer for E-Ink Devices and Beyond

KOReader is a free, open-source document viewer optimized for e-ink readers like Kindle, Kobo, and PocketBook. It supports PDF, EPUB, DJVU, and many other formats with fine-grained rendering controls.

AI Open Source 297Skills

Nougat — Neural Optical Understanding for Academic Documents

Nougat is a visual transformer model from Meta that converts academic PDF pages into structured Markdown, accurately preserving mathematical equations, tables, and text formatting.

AI Open Source 225Skills

Grimmory — Self-Hosted eBook and Comics Library Server

Grimmory is a self-hosted digital library server for managing and reading eBooks, comics, and documents. It supports EPUB, PDF, CBR, CBZ, and MOBI formats with metadata management, OPDS feeds, and a responsive web reader.

AI Open Source 180Configs

pdfmake — Client-Server PDF Generation for JavaScript

Create complex PDF documents in the browser or Node.js using a declarative document-definition object.

Script Depot 176Scripts

AI Document Intelligence

AI document processing has leapfrogged traditional OCR. Modern tools don't just recognize characters; they understand document layout, hierarchy, tables, and semantic structure.

OCR & Text Extraction: Surya delivers state-of-the-art multilingual OCR with layout detection. Marker converts PDFs to clean Markdown preserving structure. MinerU handles complex scientific papers with equations and diagrams.

Document ETL: DocETL and Unstructured build production pipelines that ingest PDFs, Word docs, scanned images, and HTML into normalized, chunked output ready for RAG or database storage.

Translation & Accessibility: PDFMathTranslate preserves mathematical notation while translating academic papers across 100+ languages.
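
To make the "chunked output" of a document ETL pipeline concrete, here is a minimal sketch of word-based chunking with overlap. It is not how DocETL or Unstructured chunk internally; the function name `chunk_words` and the default sizes are assumptions chosen for illustration.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into chunks of at most `size` words; consecutive chunks share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, size=4, overlap=1)
print(chunks)
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which tends to help retrieval at the cost of some duplicated storage.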

Knowledge Extraction: RAGFlow and Kotaemon combine document parsing with retrieval, letting you ask natural language questions over your document collection with source citations. MarkItDown converts any Office format to Markdown for AI processing.
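
The idea of answering questions with source citations can be illustrated with a toy retriever. The sketch below ranks chunks by simple word overlap, a deliberately basic stand-in for the embedding-based retrieval that tools such as RAGFlow or Kotaemon would use; the `Chunk` type, the helper names, and the sample corpus are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Chunk:
    source: str  # file the text came from
    page: int    # page number used for the citation
    text: str


def tokens(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: List[Chunk], top_k: int = 2) -> List[Chunk]:
    """Rank chunks by how many question words they share; drop chunks with no overlap."""
    query = tokens(question)
    scored = [(len(query & tokens(c.text)), c) for c in chunks]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [c for _, c in ranked[:top_k]]


def cite(chunk: Chunk) -> str:
    """Attach a source citation to the retrieved text."""
    return f"{chunk.text} [{chunk.source}, p. {chunk.page}]"


corpus = [
    Chunk("handbook.pdf", 3, "Invoices are archived for seven years."),
    Chunk("handbook.pdf", 9, "Vacation requests need manager approval."),
    Chunk("policy.docx", 1, "Archived invoices are stored in the finance vault."),
]
hits = retrieve("Where are archived invoices stored?", corpus)
for hit in hits:
    print(cite(hit))
```

Carrying `source` and `page` alongside every chunk is what makes the citation possible; a parser that discards that metadata cannot support it later.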

Much of the world's knowledge is locked in PDFs, and AI document tools are what unlock it.

Frequently Asked Questions

What is the best AI tool for extracting text from PDFs?

For general PDFs: Marker converts to clean Markdown with excellent layout preservation. For scanned documents: Surya OCR handles 90+ languages with superior accuracy on complex layouts. For scientific papers: MinerU specializes in equations, tables, and figures. For production pipelines: Unstructured and DocETL offer end-to-end document processing with chunking and metadata extraction.

Can AI extract tables from PDFs accurately?

Yes. Modern tools such as Surya, Marker, and MinerU use vision models that understand table structure (headers, merged cells, rows that span several columns), not just grid lines. Accuracy exceeds 95% on well-formatted tables. For complex or inconsistent tables, combining several tools (OCR + layout detection + LLM post-processing) gives the best results.
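
To show what "understanding table structure" means in practice, the sketch below takes detected cells with row and column spans and expands them into a dense grid before rendering Markdown. The `Cell` format is a hypothetical stand-in for whatever a layout model emits; it is not the output schema of Surya, Marker, or MinerU.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    row: int
    col: int
    text: str
    rowspan: int = 1
    colspan: int = 1


def to_grid(cells: List[Cell]) -> List[List[str]]:
    """Expand merged cells so every grid position holds the text that covers it."""
    n_rows = max(c.row + c.rowspan for c in cells)
    n_cols = max(c.col + c.colspan for c in cells)
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        for r in range(c.row, c.row + c.rowspan):
            for k in range(c.col, c.col + c.colspan):
                grid[r][k] = c.text
    return grid


def to_markdown(grid: List[List[str]]) -> str:
    """Render the grid as a Markdown table, using the first row as the header."""
    lines = ["| " + " | ".join(grid[0]) + " |", "|" + " --- |" * len(grid[0])]
    lines += ["| " + " | ".join(row) + " |" for row in grid[1:]]
    return "\n".join(lines)


cells = [
    Cell(0, 0, "Region"), Cell(0, 1, "Q1"), Cell(0, 2, "Q2"),
    Cell(1, 0, "EMEA"), Cell(1, 1, "n/a", colspan=2),  # one merged cell spanning Q1 and Q2
]
table = to_markdown(to_grid(cells))
print(table)
```

Repeating the merged value in each covered position is one simple policy; a post-processing step could instead leave the extra positions empty, depending on what the downstream consumer expects.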

How do I process thousands of documents with AI?

Use pipeline tools such as DocETL or Unstructured, which handle batching, parallel processing, and error recovery. They normalize different formats (PDF, DOCX, images, HTML) into a single output format, extract metadata, chunk content for RAG, and store the results in your database or vector store. TokRepo hosts preconfigured pipeline configs for common document-processing workflows.
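
The combination of parallel processing and error recovery can be sketched with the Python standard library alone. This is an illustrative pattern under stated assumptions, not the internals of DocETL or Unstructured: `process_all` and `fake_parse` are hypothetical names, and `parse` stands in for any real document parser.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple


def process_all(
    paths: Iterable[str],
    parse: Callable[[str], str],
    workers: int = 4,
    retries: int = 2,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Parse documents in parallel; return (results, errors), both keyed by path."""

    def attempt(path: str) -> str:
        last_error = None
        for _ in range(retries + 1):
            try:
                return parse(path)
            except Exception as exc:  # retry on any parser failure
                last_error = exc
        raise last_error

    results: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(attempt, p): p for p in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                # One bad file is recorded and skipped instead of stopping the batch.
                errors[path] = str(exc)
    return results, errors


# Stand-in parser: fails on ".bad" files, otherwise returns fake "parsed" text.
def fake_parse(path: str) -> str:
    if path.endswith(".bad"):
        raise ValueError(f"cannot parse {path}")
    return path.upper()


results, errors = process_all(["a.pdf", "b.docx", "c.bad"], fake_parse)
print(results, errors)
```

Threads suit parsers that wait on I/O or remote APIs; for CPU-bound local parsing, a process pool would likely be the better fit.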

Explore related categories

AI Tools for RAG, AI Tools for Research, AI Tools for Documentation, AI Tools for Content Creation