Scripts · May 31, 2026 · 3 min read

Chandra — OCR Model for Complex Tables, Forms, and Handwriting

High-accuracy OCR model that handles structured documents with complex tables, nested forms, and handwritten annotations while preserving full layout fidelity.

Agent ready

Ready-to-run agent install

This asset can be installed after the agent chooses its runtime, checks the plan, and runs the matching command.

Native · 98/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Kind: Skill
Install: Single
Trust: Established
Entrypoint: Chandra
Direct install command
npx -y tokrepo@latest install 06d6a932-5ca8-11f1-9bc6-00163e2b0d79 --target codex

Run after dry-run confirms the install plan.

Introduction

Chandra is an open-source OCR model built to handle the documents that standard OCR tools struggle with: dense tables with merged cells, multi-column forms, handwritten annotations, and mixed-layout pages. It preserves the full spatial structure of the document, outputting structured data rather than flat text streams.

What Chandra Does

  • Extracts text from complex tables with merged cells, nested headers, and spanning rows
  • Recognizes handwritten text alongside printed content in the same document
  • Preserves document layout including columns, sections, and spatial relationships
  • Outputs structured formats (JSON, Markdown, HTML) that maintain table and form structure
  • Processes scanned PDFs, photographs of documents, and screenshots
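
To make the difference between structured output and a flat text stream concrete, here is a minimal sketch that renders a table with a merged header cell to HTML. The `Cell` record and its field names are assumptions made for illustration; they are not Chandra's actual output schema.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Cell:
    # Hypothetical cell record; not Chandra's real output schema.
    row: int
    col: int
    text: str
    rowspan: int = 1
    colspan: int = 1


def render_html(cells):
    """Render cells to an HTML table, keeping row and column spans."""
    rows = {}
    for cell in sorted(cells, key=lambda c: (c.row, c.col)):
        rows.setdefault(cell.row, []).append(cell)
    lines = ["<table>"]
    for row_index in sorted(rows):
        parts = []
        for cell in rows[row_index]:
            attrs = ""
            if cell.rowspan > 1:
                attrs += f' rowspan="{cell.rowspan}"'
            if cell.colspan > 1:
                attrs += f' colspan="{cell.colspan}"'
            parts.append(f"<td{attrs}>{escape(cell.text)}</td>")
        lines.append("  <tr>" + "".join(parts) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


cells = [
    Cell(0, 0, "Region"),
    Cell(0, 1, "Sales", colspan=2),  # header merged across two columns
    Cell(1, 0, "North"),
    Cell(1, 1, "Q1: 10"),
    Cell(1, 2, "Q2: 12"),
]
html_table = render_html(cells)
print(html_table)
```

A flat text stream would lose the fact that "Sales" spans both quarter columns; the `colspan` attribute keeps that relationship explicit.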

Architecture Overview

Chandra uses a vision-language model architecture with a layout-aware encoder that segments the document into regions (text blocks, tables, figures, handwriting) before applying specialized decoders for each region type. The table decoder uses a cell-graph approach that explicitly models row and column relationships, while the handwriting decoder uses an attention-based sequence model trained on diverse writing styles.
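
The cell-graph idea can be illustrated with a small data-structure sketch. This is not Chandra's decoder: the tuple layout and the `build_cell_graph` helper are hypothetical, and the code only shows how explicit row and column relationships can be derived once cell spans are known.

```python
from collections import defaultdict

# Each cell: (id, row, col, rowspan, colspan). Hypothetical layout.
cells = [
    ("header", 0, 0, 1, 2),  # merged across two columns
    ("a", 1, 0, 1, 1),
    ("b", 1, 1, 1, 1),
]


def build_cell_graph(cells):
    """Return (grid, edges).

    grid maps every (row, col) slot to the id of the cell covering it.
    edges maps a cell id to the ids directly to its right and directly below.
    """
    grid = {}
    for cell_id, row, col, rowspan, colspan in cells:
        for r in range(row, row + rowspan):
            for c in range(col, col + colspan):
                grid[(r, c)] = cell_id

    edges = defaultdict(lambda: {"right": set(), "below": set()})
    for (r, c), cell_id in grid.items():
        right = grid.get((r, c + 1))
        below = grid.get((r + 1, c))
        if right is not None and right != cell_id:
            edges[cell_id]["right"].add(right)
        if below is not None and below != cell_id:
            edges[cell_id]["below"].add(below)
    return grid, dict(edges)


grid, edges = build_cell_graph(cells)
print(edges)
```

Here the merged header ends up with both "a" and "b" below it, which is the kind of relationship a flat row-by-row reading cannot express.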

Self-Hosting & Configuration

  • Install via pip with Python 3.10+ and PyTorch
  • Download model weights automatically on first run or pre-download for offline use
  • Configure GPU acceleration with CUDA or run on CPU for smaller documents
  • Set output format (JSON, Markdown, HTML) and language preferences
  • Integrate with document processing pipelines via the Python API or CLI
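
The GPU-or-CPU choice and the output-format setting above might be wired up roughly as follows. This is a sketch under stated assumptions: `OcrSettings` and its fields are hypothetical names, not part of Chandra's API, and the only real library call is PyTorch's `torch.cuda.is_available()`.

```python
from dataclasses import dataclass


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


@dataclass
class OcrSettings:
    # Hypothetical settings holder; field names are illustrative only.
    device: str
    output_format: str = "markdown"

    def __post_init__(self):
        allowed = {"json", "markdown", "html"}
        if self.output_format not in allowed:
            raise ValueError(f"output_format must be one of {sorted(allowed)}")


settings = OcrSettings(device=pick_device(), output_format="json")
print(settings)
```

Validating the format up front means a typo fails immediately rather than after a long inference run.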

Key Features

  • Table extraction that correctly handles merged cells, multi-line cells, and nested tables
  • Handwriting recognition supporting multiple scripts and writing styles
  • Layout preservation that maintains reading order across complex multi-column pages
  • Batch processing mode for high-throughput document pipelines
  • Language support for documents mixing Latin, CJK, and other scripts
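
Reading order on multi-column pages is easy to get wrong: sorting text blocks by vertical position alone interleaves the columns. The sketch below shows one simple heuristic, grouping blocks into columns by their left edge and then reading each column top to bottom. It is an illustration only, not Chandra's method, and the `column_gap` threshold is an assumed value.

```python
def reading_order(blocks, column_gap=50):
    """Order blocks column by column (left to right), top to bottom within each.

    blocks: list of (x0, y0, text) giving the top-left corner of each block.
    column_gap: minimum jump in x0 that starts a new column (assumed, in pixels).
    """
    columns = []
    for block in sorted(blocks, key=lambda b: b[0]):
        # Join the current column if this block's left edge is close to the
        # left edge of the block most recently added to it.
        if columns and block[0] - columns[-1][-1][0] < column_gap:
            columns[-1].append(block)
        else:
            columns.append([block])

    ordered = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b[1]))
    return [text for _, _, text in ordered]


blocks = [
    (400, 100, "right top"),
    (40, 300, "left bottom"),
    (42, 100, "left top"),
    (405, 300, "right bottom"),
]
order = reading_order(blocks)
print(order)
```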

Comparison with Similar Tools

  • Tesseract: general-purpose OCR engine; Chandra targets structured document understanding (tables, forms, layout)
  • Surya: focused on multilingual text detection; Chandra emphasizes table and form extraction
  • Nougat: specialized for academic papers; Chandra targets a broader range of document types
  • Azure/Google Document AI: cloud services; Chandra runs locally with no per-request API fees

FAQ

Q: Does it require a GPU? A: A GPU is recommended for speed but not required. CPU inference works for smaller documents.

Q: What input formats are supported? A: PDF, PNG, JPEG, TIFF, and BMP. Multi-page PDFs are processed page by page.
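
As a sketch of how a pipeline might pre-filter inputs and group them for batch processing, the helpers below check file extensions against the formats listed above and split a list into fixed-size batches. The function names are hypothetical, and the extra spellings `.jpg` and `.tif` are assumptions; the source only names JPEG and TIFF.

```python
from pathlib import Path

# Extensions for the formats named above; ".jpg" and ".tif" are assumed aliases.
SUPPORTED = {".pdf", ".png", ".jpg", ".jpeg", ".tiff", ".tif", ".bmp"}


def is_supported(path):
    """Return True when the file extension matches a supported input format."""
    return Path(path).suffix.lower() in SUPPORTED


def batches(items, size):
    """Yield consecutive slices of items with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


files = ["scan.PDF", "form.png", "notes.docx", "photo.jpeg", "page.bmp"]
accepted = [name for name in files if is_supported(name)]
grouped = list(batches(accepted, 2))
print(grouped)
```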

Q: How does it handle rotated or skewed documents? A: Chandra includes automatic deskewing and rotation correction as a preprocessing step.
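
The source does not say how Chandra's deskewing works. As a generic illustration of the idea, the sketch below estimates a skew angle by fitting a least-squares line through points sampled along a text baseline; the function name and input format are assumptions, and a real preprocessing step would still need to detect the baseline and rotate the image.

```python
import math


def estimate_skew_degrees(points):
    """Fit a least-squares line through (x, y) points and return its angle in degrees."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("points must not share a single x coordinate")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # The slope is sxy / sxx; atan2 turns it into an angle.
    return math.degrees(math.atan2(sxy, sxx))


# Synthetic baseline tilted by 2 degrees.
slope = math.tan(math.radians(2.0))
baseline = [(x, x * slope) for x in range(0, 500, 50)]
angle = estimate_skew_degrees(baseline)
print(round(angle, 3))
```

Rotating the page by the negative of the estimated angle would bring the baseline back to horizontal.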

Q: Can I fine-tune it on my own document types? A: Yes. The training pipeline supports fine-tuning on custom labeled datasets.
