# Chandra: OCR Model for Complex Tables, Forms, and Handwriting

> High-accuracy OCR model that handles structured documents with complex tables, nested forms, and handwritten annotations while preserving layout fidelity.

## Quick Use

```bash
pip install chandra-ocr
chandra extract document.pdf --output result.json

# Or use the Python API (run inline here through a heredoc):
python - <<'EOF'
from chandra import extract

result = extract("document.pdf")
print(result.markdown)
EOF
```

## Introduction

Chandra is an open-source OCR model built for the documents that standard OCR tools struggle with: dense tables with merged cells, multi-column forms, handwritten annotations, and mixed-layout pages. It preserves the spatial structure of the document and outputs structured data rather than a flat text stream.

## What Chandra Does

- Extracts text from complex tables with merged cells, nested headers, and spanning rows
- Recognizes handwritten text alongside printed content in the same document
- Preserves document layout, including columns, sections, and spatial relationships
- Outputs structured formats (JSON, Markdown, HTML) that maintain table and form structure
- Processes scanned PDFs, photographs of documents, and screenshots

## Architecture Overview

Chandra uses a vision-language model architecture. A layout-aware encoder first segments the document into regions (text blocks, tables, figures, handwriting), and a specialized decoder then handles each region type. The table decoder uses a cell-graph approach that explicitly models row and column relationships, while the handwriting decoder uses an attention-based sequence model trained on diverse writing styles.
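
As a rough illustration of the segment-then-decode flow described above, the Python sketch below routes each region to a decoder chosen by its type and expands a table with merged cells into a dense grid. Every name in it (`Region`, `Cell`, `DECODERS`, `decode_page`, `cells_to_grid`) is hypothetical and not part of Chandra's API. The real model uses learned decoders, and its cell-graph representation may differ from the simple span-based one assumed here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Cell:
    """One table cell; the spans describe merged cells (assumed structure)."""
    row: int
    col: int
    text: str
    row_span: int = 1
    col_span: int = 1


@dataclass(frozen=True)
class Region:
    """A segmented page region (hypothetical stand-in for encoder output)."""
    kind: str        # "text", "table", "figure", or "handwriting"
    payload: object  # placeholder for the region's pixels or features


def cells_to_grid(cells: list[Cell]) -> list[list[str]]:
    """Expand spanning cells into a dense grid, repeating merged text."""
    n_rows = max(c.row + c.row_span for c in cells)
    n_cols = max(c.col + c.col_span for c in cells)
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        for r in range(c.row, c.row + c.row_span):
            for k in range(c.col, c.col + c.col_span):
                grid[r][k] = c.text
    return grid


def decode_text(region: Region) -> str:
    # Toy decoder: the payload already holds the recognized string.
    return str(region.payload)


def decode_table(region: Region) -> list[list[str]]:
    # Toy decoder: the payload already holds the detected cells.
    return cells_to_grid(region.payload)


# One decoder per region type; figures have no decoder in this sketch.
DECODERS: dict[str, Callable[[Region], object]] = {
    "text": decode_text,
    "handwriting": decode_text,
    "table": decode_table,
}


def decode_page(regions: list[Region]) -> list[tuple[str, object]]:
    """Route each region to its decoder, preserving reading order."""
    decoded = []
    for region in regions:
        decoder = DECODERS.get(region.kind)
        if decoder is None:
            continue  # skip region types without a decoder
        decoded.append((region.kind, decoder(region)))
    return decoded


page = [
    Region("text", "Invoice 2024"),
    Region("table", [Cell(0, 0, "Item", col_span=2), Cell(1, 0, "Pen"), Cell(1, 1, "2")]),
    Region("figure", None),
]
result = decode_page(page)
print(result)
```

In this toy page the header cell spans two columns, so its text is repeated in both grid positions, and the figure region is skipped because no decoder is registered for it.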

## Self-Hosting & Configuration

- Install via pip with Python 3.10+ and PyTorch
- Download model weights automatically on first run or pre-download for offline use
- Configure GPU acceleration with CUDA or run on CPU for smaller documents
- Set output format (JSON, Markdown, HTML) and language preferences
- Integrate with document processing pipelines via the Python API or CLI

## Key Features

- Table extraction that correctly handles merged cells, multi-line cells, and nested tables
- Handwriting recognition supporting multiple scripts and writing styles
- Layout preservation that maintains reading order across complex multi-column pages
- Batch processing mode for high-throughput document pipelines
- Language support for documents mixing Latin, CJK, and other scripts

## Comparison with Similar Tools

- **Tesseract**: general-purpose OCR; Chandra excels at structured document understanding
- **Surya**: focused on multilingual text detection; Chandra adds table and form extraction
- **Nougat**: specialized for academic papers; Chandra handles any document type
- **Azure/Google Document AI**: cloud services; Chandra runs locally with no API costs

## FAQ

**Q: Does it require a GPU?**
A: A GPU is recommended for speed but not required. CPU inference works for smaller documents.

**Q: What input formats are supported?**
A: PDF, PNG, JPEG, TIFF, and BMP. Multi-page PDFs are processed page by page.

**Q: How does it handle rotated or skewed documents?**
A: Chandra includes automatic deskewing and rotation correction as a preprocessing step.

**Q: Can I fine-tune it on my own document types?**
A: Yes. The training pipeline supports fine-tuning on custom labeled datasets.

## Sources

- https://github.com/datalab-to/chandra
- https://datalab.to/chandra

---

Source: https://tokrepo.com/en/workflows/asset-06d6a932
Author: Script Depot