# Chandra — OCR Model for Complex Tables, Forms, and Handwriting

> High-accuracy OCR model that handles structured documents with complex tables, nested forms, and handwritten annotations while preserving full layout fidelity.

## Quick Use
```bash
pip install chandra-ocr
chandra extract document.pdf --output result.json
```

Or use the Python API:

```python
from chandra import extract

result = extract("document.pdf")
print(result.markdown)
```

## Introduction
Chandra is an open-source OCR model built to handle the documents that standard OCR tools struggle with: dense tables with merged cells, multi-column forms, handwritten annotations, and mixed-layout pages. It preserves the full spatial structure of the document, outputting structured data rather than flat text streams.

## What Chandra Does
- Extracts text from complex tables with merged cells, nested headers, and spanning rows
- Recognizes handwritten text alongside printed content in the same document
- Preserves document layout including columns, sections, and spatial relationships
- Outputs structured formats (JSON, Markdown, HTML) that maintain table and form structure
- Processes scanned PDFs, photographs of documents, and screenshots
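
To make "structured formats that maintain table structure" concrete, here is a minimal sketch that renders a table held as a JSON object into a Markdown table. The `header` and `rows` field names are assumptions made for illustration; they are not Chandra's documented output schema.

```python
import json


def table_to_markdown(table: dict) -> str:
    """Render a {'header': [...], 'rows': [[...], ...]} dict as a Markdown table."""

    def clean(value) -> str:
        # Escape pipes so cell text cannot break the column layout.
        return str(value).replace("|", "\\|")

    header = [clean(c) for c in table["header"]]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in table["rows"]:
        # Pad short rows so every line has the same number of cells.
        cells = [clean(c) for c in row] + [""] * (len(header) - len(row))
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


# Hypothetical extraction result; the field names are illustrative only.
raw = '{"header": ["Item", "Qty"], "rows": [["Bolt", 4], ["Nut"]]}'
markdown = table_to_markdown(json.loads(raw))
print(markdown)
```

Keeping the table as rows and cells until the last step is what allows the same data to be emitted as JSON, Markdown, or HTML without re-running recognition.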

## Architecture Overview
Chandra uses a vision-language model architecture with a layout-aware encoder that segments the document into regions (text blocks, tables, figures, handwriting) before applying specialized decoders for each region type. The table decoder uses a cell-graph approach that explicitly models row and column relationships, while the handwriting decoder uses an attention-based sequence model trained on diverse writing styles.
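
The cell-graph idea can be illustrated with a small, self-contained sketch. It is not Chandra's implementation: the `Cell` class, `to_grid`, and `neighbours` are hypothetical names, and the code only shows how merged cells can be expanded into a grid and linked by explicit row and column adjacency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A table cell with its top-left position and spans (illustrative type)."""
    text: str
    row: int
    col: int
    row_span: int = 1
    col_span: int = 1


def to_grid(cells, n_rows, n_cols):
    """Expand spanning cells into a dense grid; each slot holds its owning cell."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    for cell in cells:
        for r in range(cell.row, cell.row + cell.row_span):
            for c in range(cell.col, cell.col + cell.col_span):
                if grid[r][c] is not None:
                    raise ValueError(f"cells overlap at ({r}, {c})")
                grid[r][c] = cell
    return grid


def neighbours(grid):
    """Return (source, target, direction) edges between distinct adjacent cells."""
    edges = set()
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell is None:
                continue
            if c + 1 < len(row) and row[c + 1] not in (None, cell):
                edges.add((cell.text, row[c + 1].text, "right"))
            if r + 1 < len(grid) and grid[r + 1][c] not in (None, cell):
                edges.add((cell.text, grid[r + 1][c].text, "below"))
    return edges


# A header spanning two columns above two ordinary cells.
cells = [
    Cell("Region", row=0, col=0, col_span=2),
    Cell("North", row=1, col=0),
    Cell("South", row=1, col=1),
]
grid = to_grid(cells, n_rows=2, n_cols=2)
edges = neighbours(grid)
print(sorted(edges))
```

Because the merged header owns both slots in the first row, it gains a "below" edge to each cell underneath it, which is the relationship a flat text stream would lose.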

## Self-Hosting & Configuration
- Install via pip with Python 3.10+ and PyTorch
- Download model weights automatically on first run or pre-download for offline use
- Configure GPU acceleration with CUDA or run on CPU for smaller documents
- Set output format (JSON, Markdown, HTML) and language preferences
- Integrate with document processing pipelines via the Python API or CLI
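
The options above could be gathered into a single settings object, as in the sketch below. `OcrSettings` and `pick_device` are hypothetical helpers written for this example, not part of Chandra's API; the only library call used is PyTorch's `torch.cuda.is_available()`, and the code falls back to CPU when PyTorch is not installed.

```python
from dataclasses import dataclass

# Output formats named in this document.
SUPPORTED_FORMATS = {"json", "markdown", "html"}


def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


@dataclass
class OcrSettings:
    """Hypothetical settings container; field names are assumptions."""
    output_format: str = "markdown"
    language: str = "en"
    device: str = ""  # empty means "detect automatically"

    def __post_init__(self):
        self.output_format = self.output_format.lower()
        if self.output_format not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported output format: {self.output_format!r}")
        if not self.device:
            self.device = pick_device()


settings = OcrSettings(output_format="JSON")
print(settings)
```

Validating the format up front means a typo fails before any model weights are loaded.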

## Key Features
- Table extraction that correctly handles merged cells, multi-line cells, and nested tables
- Handwriting recognition supporting multiple scripts and writing styles
- Layout preservation that maintains reading order across complex multi-column pages
- Batch processing mode for high-throughput document pipelines
- Language support for documents mixing Latin, CJK, and other scripts
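
As a rough illustration of what reading-order recovery involves on a multi-column page, the sketch below groups text blocks into columns by their left edge and then reads each column top to bottom. This is a simple geometric heuristic, not the method Chandra uses; the `Block` type and the `gap` threshold are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A text block with the coordinates of its top-left corner (illustrative)."""
    text: str
    x: float
    y: float


def reading_order(blocks, gap=20.0):
    """Order blocks column by column, top to bottom within each column."""
    columns = []
    for block in sorted(blocks, key=lambda b: b.x):
        # Start a new column when the left edge jumps by more than `gap`.
        if columns and block.x - columns[-1][-1].x <= gap:
            columns[-1].append(block)
        else:
            columns.append([block])
    ordered = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y))
    return ordered


blocks = [
    Block("B1", x=300, y=50),
    Block("A2", x=52, y=200),
    Block("A1", x=50, y=50),
    Block("B2", x=305, y=220),
]
order = [b.text for b in reading_order(blocks)]
print(order)
```

A heuristic like this breaks down on full-width headings or figures that interrupt the columns, which is one reason a learned layout model is attractive for complex pages.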

## Comparison with Similar Tools
- **Tesseract**: general-purpose OCR engine; Chandra targets structured document understanding (tables, forms, layout)
- **Surya**: focused on multilingual text detection; Chandra adds table and form extraction
- **Nougat**: specialized for academic papers; Chandra is aimed at a broader range of document types
- **Azure/Google Document AI**: hosted cloud services; Chandra runs locally with no per-request API costs

## FAQ
**Q: Does it require a GPU?**
A: A GPU is recommended for speed but not required. CPU inference works for smaller documents.

**Q: What input formats are supported?**
A: PDF, PNG, JPEG, TIFF, and BMP. Multi-page PDFs are processed page by page.
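
When feeding a folder of mixed files into a pipeline, it can help to filter by extension first. The sketch below is a hypothetical pre-filter, not part of Chandra; the suffix set is an assumption that maps the formats listed above to their common file extensions.

```python
from pathlib import Path

# Assumed extensions for PDF, PNG, JPEG, TIFF, and BMP inputs.
SUPPORTED_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def select_inputs(paths):
    """Split paths into (supported, skipped) by case-insensitive file suffix."""
    supported, skipped = [], []
    for path in map(Path, paths):
        if path.suffix.lower() in SUPPORTED_SUFFIXES:
            supported.append(path)
        else:
            skipped.append(path)
    return supported, skipped


supported, skipped = select_inputs(["scan.PDF", "photo.jpeg", "notes.docx"])
print([p.name for p in supported])
print([p.name for p in skipped])
```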

**Q: How does it handle rotated or skewed documents?**
A: Chandra includes automatic deskewing and rotation correction as a preprocessing step.

**Q: Can I fine-tune it on my own document types?**
A: Yes. The training pipeline supports fine-tuning on custom labeled datasets.

## Sources
- https://github.com/datalab-to/chandra
- https://datalab.to/chandra

---
Source: https://tokrepo.com/en/workflows/asset-06d6a932
Author: Script Depot