Introduction
LanguageTool is an open-source proofreading engine that checks grammar, style, and spelling in over 25 languages. Run as a self-hosted HTTP API server, it gives teams and individuals a private alternative to cloud-based grammar checkers, because the text being checked never has to be sent to a third-party service.
What LanguageTool Does
- Detects grammar errors, style issues, and spelling mistakes in 25+ languages
- Runs as an HTTP API server that accepts text and returns annotated corrections
- Provides browser extensions, IDE plugins, and office suite add-ons
- Supports custom rule definitions in XML or Java for domain-specific checks
- Offers an n-gram dataset integration for improved error detection based on word frequency
Architecture Overview
LanguageTool is a Java application built on a hybrid rule-based and statistical engine. The core processes text through a pipeline: sentence splitting, tokenization, part-of-speech (POS) tagging, and then rule matching. Rules are defined per language in XML files or Java classes. The HTTP server wraps this engine behind a JSON API. Optional n-gram datasets (several gigabytes of word-sequence frequency data) improve detection of commonly confused words.
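The pipeline stages can be sketched in a few lines of Python. This is a toy illustration only, not LanguageTool's implementation: the regex-based splitter and tokenizer, the tiny `TOY_LEXICON`, and the single `rule_a_vs_an` rule are all hypothetical stand-ins. The sketch splits the text into sentences, tokenizes each sentence, tags the tokens, and then runs a rule over the tagged tokens.

```python
import re
from dataclasses import dataclass


@dataclass
class Match:
    offset: int       # character offset of the flagged span in the full text
    length: int       # length of the flagged span
    message: str      # human-readable explanation
    replacement: str  # suggested correction


# Hypothetical mini-lexicon standing in for a real POS tagger.
TOY_LEXICON = {"a": "DT", "an": "DT", "the": "DT", "apple": "NN", "banana": "NN"}


def split_sentences(text):
    """Yield (start_offset, sentence) pairs, ending a sentence at '.', '!' or '?'."""
    for m in re.finditer(r"[^.!?]+[.!?]?", text):
        yield m.start(), m.group()


def tokenize(sentence, base):
    """Return (absolute_offset, word) pairs for each word in the sentence."""
    return [(base + m.start(), m.group()) for m in re.finditer(r"[A-Za-z']+", sentence)]


def tag(tokens):
    """Attach a POS tag to each token; unknown words get 'UNK'."""
    return [(off, word, TOY_LEXICON.get(word.lower(), "UNK")) for off, word in tokens]


def rule_a_vs_an(tagged):
    """Flag the determiner 'a' when the next word starts with a vowel letter."""
    for (off, word, pos), (_, nxt, _) in zip(tagged, tagged[1:]):
        if pos == "DT" and word.lower() == "a" and nxt[0].lower() in "aeiou":
            yield Match(off, len(word), "Use 'an' before a vowel sound.", "an")


def check(text):
    """Run every sentence through tokenize -> tag -> rule matching."""
    matches = []
    for start, sentence in split_sentences(text):
        matches.extend(rule_a_vs_an(tag(tokenize(sentence, start))))
    return matches
```

Calling `check("I ate a apple. She has a banana.")` flags only the first "a". A real engine differs mainly in scale: full taggers, disambiguation, and thousands of rules per language instead of one.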
Self-Hosting & Configuration
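- Deploy via Docker or run the JAR file directly with Java 8+ (recent releases require a newer Java version, so check the release notes for the version you install)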
- Configure server.properties for port, max text length, and allowed origins
- Download optional n-gram datasets for English, German, French, and other languages to improve accuracy
- Set memory limits appropriately (2-4 GB recommended for n-gram mode)
- Integrate with a reverse proxy for HTTPS termination and rate limiting
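The setup steps above could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the helper names `render_properties` and `build_server_command` are hypothetical, the `/data/ngrams` path is a placeholder, and the 2g heap is just the lower end of the range mentioned above. The `maxTextLength` and `languageModel` keys and the `--port`, `--config`, and `--allow-origin` flags follow the standalone server's documented options as commonly described; port and allowed origin are passed here as command-line flags, so verify where each setting belongs for your version before relying on it.

```python
def render_properties(settings):
    """Render a dict as Java-style key=value lines for a properties file."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


def build_server_command(jar, port, config_path, allow_origin=None, heap="2g"):
    """Build the argument list for starting the standalone HTTP server.

    Assumption: the server class is org.languagetool.server.HTTPServer and it
    accepts --port, --config and --allow-origin; check your release's docs.
    """
    cmd = [
        "java", f"-Xmx{heap}",            # cap the JVM heap
        "-cp", str(jar),
        "org.languagetool.server.HTTPServer",
        "--port", str(port),
        "--config", str(config_path),
    ]
    if allow_origin is not None:
        cmd += ["--allow-origin", allow_origin]
    return cmd


# Example values (placeholders, adjust to your environment).
properties = render_properties({
    "maxTextLength": 50000,               # reject longer requests
    "languageModel": "/data/ngrams",      # hypothetical n-gram directory
})
command = build_server_command(
    "languagetool-server.jar", 8081, "server.properties", allow_origin="*"
)
```

Writing `properties` to `server.properties` and passing `command` to `subprocess.run` would start the server; keeping the two steps as pure functions makes the configuration easy to review before anything is launched.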
Key Features
- Multilingual support: English, German, French, Spanish, Portuguese, Dutch, and 20+ more
- Rule-based engine with thousands of grammar and style patterns per language
- Custom rule authoring via XML for organization-specific terminology and style guides
- REST API with JSON responses including error position, message, and suggested replacements
- Browser extensions (Firefox, Chrome) and editor plugins (LibreOffice, Google Docs) can point to a self-hosted instance
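To make the API item above concrete, here is a hedged sketch of a small client using only the Python standard library. It assumes a self-hosted instance at `http://localhost:8081` exposing the `/v2/check` endpoint; the base URL and the helper names `check_text`, `summarize_matches`, and `apply_first_suggestions` are illustrative assumptions. The fields read from the response (`matches`, `offset`, `length`, `message`, `replacements[].value`) correspond to the error position, message, and suggested replacements mentioned above.

```python
import json
import urllib.parse
import urllib.request


def check_text(text, language="en-US", base_url="http://localhost:8081"):
    """POST text to a LanguageTool server and return the decoded JSON response."""
    data = urllib.parse.urlencode({"text": text, "language": language}).encode("utf-8")
    with urllib.request.urlopen(f"{base_url}/v2/check", data=data, timeout=10) as resp:
        return json.load(resp)


def summarize_matches(response):
    """Reduce each match to its position, message and suggested replacements."""
    return [
        {
            "offset": m["offset"],
            "length": m["length"],
            "message": m["message"],
            "suggestions": [r["value"] for r in m.get("replacements", [])],
        }
        for m in response.get("matches", [])
    ]


def apply_first_suggestions(text, response):
    """Apply the first suggestion of every match, if it has one.

    Matches are processed from the end of the text backwards so that
    earlier offsets stay valid after each replacement.
    """
    ordered = sorted(response.get("matches", []), key=lambda m: m["offset"], reverse=True)
    for m in ordered:
        if m.get("replacements"):
            start, end = m["offset"], m["offset"] + m["length"]
            text = text[:start] + m["replacements"][0]["value"] + text[end:]
    return text
```

With a server running, `apply_first_suggestions(text, check_text(text))` would return the text with the top suggestion applied to each match. Blindly accepting the first suggestion is rarely what you want in production; the sketch only shows how the offsets and replacements fit together.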
Comparison with Similar Tools
- Grammarly — cloud-only commercial service; LanguageTool is self-hostable and open source
- Vale — prose linter for technical writing; LanguageTool covers grammar across 25+ natural languages
- Hunspell — spell checker only; LanguageTool adds grammar and style checking
- Sapling — AI writing assistant API; LanguageTool is rule-based with optional statistical models, fully self-hosted
FAQ
Q: Can I point the browser extension to my own server? A: Yes. Both the Firefox and Chrome extensions allow setting a custom LanguageTool server URL in their options.
Q: How much memory does the server need? A: The base server runs with 512 MB. With n-gram datasets enabled, allocate 2-4 GB for optimal performance.
Q: Can I add custom rules for my team's style guide? A: Yes. Write rules in XML and place them in the rules directory, or implement Java-based rules for complex pattern matching.
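As an illustration of the XML route in the answer above, the following sketch generates a simple token-pattern rule with Python's standard `xml.etree.ElementTree`. The `build_rule` helper, the rule ID `TEAM_SIGN_IN`, and the "log in" versus "sign in" preference are hypothetical examples. The element names (`rule`, `pattern`, `token`, `message`, `suggestion`, `example`, `marker`) follow LanguageTool's XML rule format as commonly documented; a generated rule would still need to sit inside a category in a rules file, so validate the result against the documentation for your version.

```python
import xml.etree.ElementTree as ET


def build_rule(rule_id, name, tokens, preferred, example_before, example_after):
    """Build a <rule> element that flags a token sequence and suggests a preferred term."""
    rule = ET.Element("rule", id=rule_id, name=name)

    # One <token> per word of the phrase to flag.
    pattern = ET.SubElement(rule, "pattern")
    for tok in tokens:
        ET.SubElement(pattern, "token").text = tok

    # Message of the form: Use <suggestion>preferred</suggestion> instead.
    message = ET.SubElement(rule, "message")
    message.text = "Use "
    suggestion = ET.SubElement(message, "suggestion")
    suggestion.text = preferred
    suggestion.tail = " instead."

    # Example sentence with the offending phrase wrapped in <marker>.
    example = ET.SubElement(rule, "example", correction=preferred)
    example.text = example_before
    marker = ET.SubElement(example, "marker")
    marker.text = " ".join(tokens)
    marker.tail = example_after

    return rule


rule = build_rule(
    "TEAM_SIGN_IN", "Prefer 'sign in'", ["log", "in"],
    "sign in", "Please ", " first.",
)
rule_xml = ET.tostring(rule, encoding="unicode")
```

Generating rules from a terminology list this way keeps a team's style guide in one place (for example a spreadsheet of discouraged and preferred terms) and avoids hand-editing repetitive XML.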
Q: Does LanguageTool support real-time checking? A: The API typically responds within milliseconds for paragraph-sized input, depending on hardware and enabled rules. Editors and browser extensions debounce keystrokes and call the API after a short pause in typing, which gives a near-real-time experience.