# Tantivy: Full-Text Search Engine Library for Rust

> Tantivy is a high-performance full-text search engine library written in Rust, inspired by Apache Lucene, providing indexing and search capabilities that can be embedded into any application.

## Quick Use

```bash
# Add tantivy to your Rust project
cargo add tantivy

# Or try the CLI search tool
cargo install tantivy-cli

# Create an index and add documents
tantivy new -i ./my_index
tantivy index -i ./my_index < documents.json
tantivy search -i ./my_index -q "search query"
```

## Introduction

Tantivy is a full-text search engine library written in Rust, designed as a Lucene-equivalent for the Rust ecosystem. It provides fast indexing and querying of text, numeric, and geo-spatial data, and can be embedded directly into applications without running a separate search server.

## What Tantivy Does

- Indexes and searches text documents with BM25 scoring and term-level queries
- Supports boolean, phrase, range, regex, and fuzzy search queries
- Handles numeric, date, faceted, and IP address field types
- Provides concurrent indexing with near-real-time search visibility
- Offers Python bindings via the tantivy-py package for cross-language use

## Architecture Overview

Tantivy stores data in segments, each containing an inverted index, column store, and positional data. Writes go to an in-memory segment that is periodically committed to disk. A merge policy compacts segments in the background. Searches fan out across segments and merge results. The architecture avoids global locks, allowing concurrent reads and writes. The storage format uses memory-mapped files for efficient I/O, and the codec compresses posting lists with bitpacking and SIMD-accelerated decoding.
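
The segment model described in the architecture overview can be pictured with a small toy in plain Rust. This is a sketch using only the standard library, not Tantivy's API or storage format: the names `Segment`, `search_all`, and `merge` are hypothetical, document ids are global rather than segment-local, and raw term frequency stands in for a real score such as BM25.

```rust
use std::collections::HashMap;

/// A toy immutable segment: term -> list of (doc id, term frequency).
/// Hypothetical type for illustration; not part of Tantivy.
struct Segment {
    postings: HashMap<String, Vec<(u32, u32)>>,
}

impl Segment {
    /// Builds a segment from (doc id, text) pairs using lowercase
    /// whitespace tokenization.
    fn build(docs: &[(u32, &str)]) -> Segment {
        let mut postings: HashMap<String, Vec<(u32, u32)>> = HashMap::new();
        for &(doc_id, text) in docs {
            // Count how often each term occurs in this document.
            let mut counts: HashMap<String, u32> = HashMap::new();
            for token in text.split_whitespace() {
                *counts.entry(token.to_lowercase()).or_insert(0) += 1;
            }
            for (term, tf) in counts {
                postings.entry(term).or_default().push((doc_id, tf));
            }
        }
        Segment { postings }
    }

    /// Returns the (doc id, term frequency) hits for a single term.
    fn search(&self, term: &str) -> Vec<(u32, u32)> {
        self.postings
            .get(&term.to_lowercase())
            .cloned()
            .unwrap_or_default()
    }
}

/// Fans a term query out across all segments and merges the hits.
/// Ranking by term frequency is a stand-in for a real relevance score.
fn search_all(segments: &[Segment], term: &str, limit: usize) -> Vec<(u32, u32)> {
    let mut hits: Vec<(u32, u32)> = segments
        .iter()
        .flat_map(|segment| segment.search(term))
        .collect();
    // Highest frequency first; ties broken by doc id for a stable order.
    hits.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    hits.truncate(limit);
    hits
}

/// Combines two segments into one, as a background merge might.
fn merge(a: Segment, b: Segment) -> Segment {
    let mut postings = a.postings;
    for (term, mut list) in b.postings {
        postings.entry(term).or_default().append(&mut list);
    }
    Segment { postings }
}
```

Because each toy segment is immutable once built, a query only needs shared references to the segments, and merging two segments leaves the results of `search_all` unchanged. That property is the reason a segment-based design can compact data in the background without blocking readers.
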
## Self-Hosting & Configuration

- Add `tantivy` as a Cargo dependency for embedded use in Rust applications
- Use `tantivy-py` for Python integration via pip install
- Configure schema with typed fields (TEXT, U64, F64, DATE, FACET, BYTES, IP)
- Set indexing parameters like heap size, merge policy, and commit frequency
- Deploy as part of your application binary with no external service dependencies

## Key Features

- Written in safe Rust with no garbage collection pauses during indexing or search
- Single-node performance comparable to or exceeding Lucene for many workloads
- Supports configurable tokenizers including language-specific stemmers
- Provides snippet generation and search result highlighting
- Powers Quickwit, the distributed search engine, as its core indexing library

## Comparison with Similar Tools

- **Apache Lucene**: the Java equivalent, mature and widely used but requires JVM
- **Bleve**: full-text search library for Go, similar embedded approach
- **MeiliSearch**: search server with REST API, not an embeddable library
- **Elasticsearch**: distributed search platform, much heavier for simple use cases
- **Sonic**: lightweight search backend but fewer query features and field types

## FAQ

**Q: Is Tantivy a search server like Elasticsearch?**
A: No. Tantivy is a library you embed in your application. For a distributed search server built on Tantivy, see Quickwit.

**Q: Can I use Tantivy from Python?**
A: Yes. The tantivy-py package provides Python bindings for indexing and searching.

**Q: How does Tantivy handle concurrent writes?**
A: Tantivy uses a single IndexWriter with configurable thread pools. Multiple threads can add documents concurrently, and commits make them searchable.

**Q: Does Tantivy support distributed search?**
A: Tantivy itself is single-node. Quickwit builds distributed search on top of Tantivy for cluster deployments.
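
The concurrent-write answer above (many threads add documents, a commit makes them searchable) can be illustrated with a toy model in standard-library Rust. This is a sketch of the visibility rule only, not Tantivy's `IndexWriter`: `ToyWriter`, `add_document`, `commit`, `searcher`, and `index_concurrently` are hypothetical names, and a plain `Vec<String>` stands in for an index.

```rust
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

/// Toy writer: documents are buffered until `commit` publishes them.
/// Hypothetical type for illustration; not part of Tantivy.
struct ToyWriter {
    pending: Mutex<Vec<String>>,
    committed: RwLock<Arc<Vec<String>>>,
}

impl ToyWriter {
    fn new() -> Self {
        ToyWriter {
            pending: Mutex::new(Vec::new()),
            committed: RwLock::new(Arc::new(Vec::new())),
        }
    }

    /// Buffers a document; it is not visible to searchers yet.
    fn add_document(&self, doc: &str) {
        self.pending.lock().unwrap().push(doc.to_string());
    }

    /// Publishes all buffered documents as a new snapshot and
    /// returns the total number of committed documents.
    fn commit(&self) -> usize {
        let mut pending = self.pending.lock().unwrap();
        let mut committed = self.committed.write().unwrap();
        // Copy the old snapshot, then move the pending documents into it.
        let mut next: Vec<String> = (**committed).clone();
        next.append(&mut pending);
        *committed = Arc::new(next);
        committed.len()
    }

    /// Returns the current committed snapshot. A snapshot taken before
    /// a commit keeps its old contents after the commit.
    fn searcher(&self) -> Arc<Vec<String>> {
        let guard = self.committed.read().unwrap();
        Arc::clone(&guard)
    }
}

/// Adds documents from several threads through one shared writer.
fn index_concurrently(writer: &Arc<ToyWriter>, threads: usize, docs_per_thread: usize) {
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let w = Arc::clone(writer);
            thread::spawn(move || {
                for i in 0..docs_per_thread {
                    w.add_document(&format!("doc-{}-{}", t, i));
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

In this toy, a searcher obtained before `commit` continues to see the old snapshot, while one obtained afterwards sees every added document. Copying the whole snapshot on each commit is a simplification chosen for brevity; it says nothing about how Tantivy publishes segments internally.
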
## Sources

- https://github.com/quickwit-oss/tantivy
- https://docs.rs/tantivy

---

Source: https://tokrepo.com/en/workflows/fd82a53d-4491-11f1-9bc6-00163e2b0d79
Author: Script Depot