Chandra — OCR Model for Complex Tables, Forms, and Handwriting

Introduction

Chandra is an open-source OCR model built to handle the documents that standard OCR tools struggle with: dense tables with merged cells, multi-column forms, handwritten annotations, and mixed-layout pages. It preserves the full spatial structure of the document, outputting structured data rather than flat text streams.

What Chandra Does

Extracts text from complex tables with merged cells, nested headers, and spanning rows
Recognizes handwritten text alongside printed content in the same document
Preserves document layout including columns, sections, and spatial relationships
Outputs structured formats (JSON, Markdown, HTML) that maintain table and form structure
Processes scanned PDFs, photographs of documents, and screenshots
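To make the layout-preservation point concrete, here is a minimal sketch of reading-order recovery on a two-column page. It is not Chandra's algorithm: `Block`, `reading_order`, and the fixed `column_gap` threshold are hypothetical names and assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A text block with the top-left corner of its bounding box (hypothetical)."""
    text: str
    x: float  # left edge
    y: float  # top edge


def reading_order(blocks, column_gap=100.0):
    """Group blocks into columns by left edge, then read each column top to bottom."""
    columns = []  # each entry is a list of blocks that share a column
    for block in sorted(blocks, key=lambda b: b.x):
        # A block joins the current column if its left edge is close to
        # the left edge of that column's first block.
        if columns and block.x - columns[-1][0].x < column_gap:
            columns[-1].append(block)
        else:
            columns.append([block])
    ordered = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y))
    return [b.text for b in ordered]


blocks = [
    Block("right top", 400, 50),
    Block("left bottom", 52, 300),
    Block("left top", 50, 50),
    Block("right bottom", 398, 300),
]
print(reading_order(blocks))
```

A flat top-to-bottom sort would interleave the two columns. Grouping by column first is what keeps each column's text contiguous.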

Architecture Overview

Chandra uses a vision-language model architecture with a layout-aware encoder that segments the document into regions (text blocks, tables, figures, handwriting) before applying specialized decoders for each region type. The table decoder uses a cell-graph approach that explicitly models row and column relationships, while the handwriting decoder uses an attention-based sequence model trained on diverse writing styles.
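The idea of modeling row and column relationships explicitly can be illustrated with a much simpler stand-in: a list of cells that each carry their own position and span. The sketch below is an assumption-laden illustration, not Chandra's internal representation; `Cell` and `to_grid` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One table cell with an explicit position and span (hypothetical schema)."""
    text: str
    row: int
    col: int
    rowspan: int = 1
    colspan: int = 1


def to_grid(cells):
    """Expand spanning cells into a dense grid, repeating text in every covered slot."""
    n_rows = max(c.row + c.rowspan for c in cells)
    n_cols = max(c.col + c.colspan for c in cells)
    grid = [[None] * n_cols for _ in range(n_rows)]
    for cell in cells:
        for r in range(cell.row, cell.row + cell.rowspan):
            for c in range(cell.col, cell.col + cell.colspan):
                if grid[r][c] is not None:
                    raise ValueError(f"cells overlap at ({r}, {c})")
                grid[r][c] = cell.text
    return grid


cells = [
    Cell("Region", 0, 0, rowspan=2),  # merged vertically across two header rows
    Cell("Sales", 0, 1, colspan=2),   # merged horizontally across two columns
    Cell("Q1", 1, 1),
    Cell("Q2", 1, 2),
    Cell("North", 2, 0),
    Cell("10", 2, 1),
    Cell("12", 2, 2),
]
grid = to_grid(cells)
for row in grid:
    print(row)
```

Keeping spans on the cell rather than flattening to rows of strings is what lets merged headers survive conversion to formats such as HTML, which has `rowspan` and `colspan` attributes.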

Self-Hosting & Configuration

Install via pip with Python 3.10+ and PyTorch
Download model weights automatically on first run or pre-download for offline use
Configure GPU acceleration with CUDA or run on CPU for smaller documents
Set output format (JSON, Markdown, HTML) and language preferences
Integrate with document processing pipelines via the Python API or CLI
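The configuration choices above can be gathered into one small settings object before calling the model. This is a sketch under stated assumptions: `OcrSettings` and `pick_device` are hypothetical helpers, not part of Chandra's API, and the language codes are placeholders. The only real library call is PyTorch's `torch.cuda.is_available()`, which is used only if PyTorch is installed.

```python
from dataclasses import dataclass

# Output formats listed in this section.
SUPPORTED_FORMATS = ("json", "markdown", "html")


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


@dataclass
class OcrSettings:
    """Hypothetical settings container; field names are assumptions."""
    output_format: str = "json"
    languages: tuple = ("en",)
    device: str = ""  # empty means "detect automatically"

    def __post_init__(self):
        self.output_format = self.output_format.lower()
        if self.output_format not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported output format: {self.output_format}")
        if not self.device:
            self.device = pick_device()


settings = OcrSettings(output_format="Markdown", languages=("en", "ja"))
print(settings)
```

Validating the format up front means a typo fails immediately rather than after a long OCR run.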

Key Features

Table extraction that correctly handles merged cells, multi-line cells, and nested tables
Handwriting recognition supporting multiple scripts and writing styles
Layout preservation that maintains reading order across complex multi-column pages
Batch processing mode for high-throughput document pipelines
Language support for documents mixing Latin, CJK, and other scripts
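A batch run typically means walking a folder, keeping only supported inputs, and isolating failures so one bad file does not stop the job. The sketch below shows that pattern only. `find_documents`, `run_batch`, and the `ocr_fn` callable are hypothetical; `ocr_fn` stands in for whatever Chandra entry point you use. The extension list mirrors the input formats named in the FAQ.

```python
from pathlib import Path

# Extensions for the input formats named in the FAQ: PDF, PNG, JPEG, TIFF, BMP.
SUPPORTED_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def find_documents(root):
    """Return supported files under root, sorted for a stable processing order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


def run_batch(root, ocr_fn):
    """Apply ocr_fn to each document; collect results and errors separately."""
    results, errors = {}, {}
    for path in find_documents(root):
        try:
            results[str(path)] = ocr_fn(path)
        except Exception as exc:  # keep going so one bad file does not stop the batch
            errors[str(path)] = str(exc)
    return results, errors
```

Calling `run_batch("scans/", my_ocr_function)` returns two dictionaries keyed by file path, so failed files can be retried without reprocessing the rest.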

Comparison with Similar Tools

Tesseract: general-purpose OCR engine; Chandra focuses on structured document understanding
Surya: focused on multilingual text detection; Chandra adds table and form extraction
Nougat: specialized for academic papers; Chandra targets general documents such as tables, forms, and handwritten pages
Azure/Google Document AI: cloud services; Chandra runs locally with no per-request API costs

FAQ

Q: Does it require a GPU?
A: A GPU is recommended for speed but not required. CPU inference works for smaller documents.

Q: What input formats are supported?
A: PDF, PNG, JPEG, TIFF, and BMP. Multi-page PDFs are processed page by page.

Q: How does it handle rotated or skewed documents?
A: Chandra includes automatic deskewing and rotation correction as a preprocessing step.

Q: Can I fine-tune it on my own document types?
A: Yes. The training pipeline supports fine-tuning on custom labeled datasets.
