Elasticsearch — Distributed Search and Analytics Engine
Elasticsearch is one of the most widely used search and analytics engines. It provides near-real-time full-text search, structured search, analytics, and logging across petabytes of data, and it powers search for Wikipedia, GitHub, Stack Overflow, and many other applications.
What it is
Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It provides near-real-time full-text search, structured search, analytics, and log aggregation across petabytes of data. It powers search for applications, log analysis pipelines, security analytics, and observability stacks.
Elasticsearch targets backend engineers, data engineers, and DevOps teams who need fast search and analytics over large datasets.
How it saves time or tokens
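Elasticsearch indexes data in near real time: with the default refresh interval of one second, newly ingested documents become searchable about a second after they are indexed. The distributed architecture scales horizontally by adding nodes. Because queries run against an inverted index instead of scanning rows, a full-text search that could take minutes as a table scan in a relational database typically returns in milliseconds.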
The Query DSL supports complex queries (boolean, fuzzy, range, aggregation) without writing custom search code.
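As a rough illustration of how those query types combine, the sketch below builds one `_search` request body that mixes a fuzzy full-text match, a range filter, and a terms aggregation. It assumes an index with `name`, `price`, and `category` fields like the `products` example in this document; the function name `build_product_query` and the aggregation name `by_category` are made up for the example.

```python
import json


def build_product_query(text, max_price):
    """Return a _search body: fuzzy match on name, price filter, category facet."""
    return {
        "query": {
            "bool": {
                # Scored full-text clause; "AUTO" tolerates small typos.
                "must": {
                    "match": {"name": {"query": text, "fuzziness": "AUTO"}}
                },
                # Filter clauses narrow results without affecting the score.
                "filter": {"range": {"price": {"lte": max_price}}},
            }
        },
        # Bucket the matching documents by the keyword field.
        "aggs": {"by_category": {"terms": {"field": "category"}}},
        "size": 10,
    }


# The misspelling is deliberate, to show where fuzziness would help.
body = build_product_query("wireles mouse", 50)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a request to `products/_search`, in the same way as the curl search in the Example section.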
How to use
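- Install Elasticsearch, for example as a single-node Docker container for local testing (8.x enables TLS and authentication by default, so security is switched off here to let the plain-HTTP examples work; never do this in production):
docker run -d -p 9200:9200 -e discovery.type=single-node -e xpack.security.enabled=false elasticsearch:8.13.0
- Create an index and map your fields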
- Index documents via the REST API
- Query with the JSON Query DSL
Example
# Create an index with mappings
curl -X PUT 'localhost:9200/products' -H 'Content-Type: application/json' -d '{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "price": { "type": "float" },
      "category": { "type": "keyword" },
      "description": { "type": "text" }
    }
  }
}'

# Index a document
curl -X POST 'localhost:9200/products/_doc' -H 'Content-Type: application/json' -d '{
  "name": "Wireless Mouse",
  "price": 29.99,
  "category": "electronics",
  "description": "Ergonomic wireless mouse with USB receiver"
}'

# Search with full-text query
curl -X GET 'localhost:9200/products/_search' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "must": { "match": { "description": "wireless" } },
      "filter": { "range": { "price": { "lte": 50 } } }
    }
  }
}'
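
The example above indexes one document per request. For larger loads the `_bulk` endpoint accepts many documents in a single newline-delimited JSON (NDJSON) body. The sketch below shows one way to build that payload with the Python standard library. The helper names `build_bulk_payload` and `send_bulk`, the sample documents, and the `http://localhost:9200` address are assumptions for illustration.

```python
import json
import urllib.request


def build_bulk_payload(index_name, docs):
    """Build an NDJSON body for _bulk: one action line, then one source line, per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def send_bulk(payload, base_url="http://localhost:9200"):
    """POST the payload to _bulk (assumes an unsecured local test node)."""
    request = urllib.request.Request(
        base_url + "/_bulk",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Hypothetical sample documents matching the products mapping.
docs = [
    {"name": "Wireless Mouse", "price": 29.99, "category": "electronics"},
    {"name": "USB Keyboard", "price": 19.99, "category": "electronics"},
]
payload = build_bulk_payload("products", docs)
print(payload, end="")
```

Calling `send_bulk(payload)` against a running node returns the parsed response. Its top-level `errors` flag is worth checking, because a bulk request can succeed overall while individual items fail.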
Common pitfalls
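- Elasticsearch is memory-hungry: set the JVM heap to no more than half of available RAM (and below roughly 32 GB so compressed object pointers stay enabled), leaving the rest for the operating system's filesystem cache that Lucene relies on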
- Changing the type of an already-mapped field requires reindexing (new fields can be added in place); plan your field types carefully before production
- The free tier (Basic license) covers core search; advanced security, cross-cluster replication, and ML features require a paid subscription
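
To make the reindexing pitfall concrete, here is a minimal sketch of the two requests usually involved: creating a new index with the corrected mapping, then copying documents across with the `_reindex` API. The destination index name `products_v2`, the helper `build_reindex_plan`, and the choice of `scaled_float` for `price` are assumptions picked for illustration, not a recommendation from the original text.

```python
import json


def build_reindex_plan(source_index, dest_index, new_properties):
    """Return (method, path, body) tuples for a mapping change via reindex."""
    return [
        # 1. Create the destination index with the corrected field types.
        ("PUT", f"/{dest_index}", {"mappings": {"properties": new_properties}}),
        # 2. Copy every document from the old index into the new one.
        (
            "POST",
            "/_reindex",
            {"source": {"index": source_index}, "dest": {"index": dest_index}},
        ),
    ]


# Hypothetical change: store price as a scaled_float instead of a float.
new_properties = {
    "name": {"type": "text"},
    "price": {"type": "scaled_float", "scaling_factor": 100},
    "category": {"type": "keyword"},
    "description": {"type": "text"},
}

plan = build_reindex_plan("products", "products_v2", new_properties)
for method, path, body in plan:
    print(method, path)
    print(json.dumps(body, indent=2))
```

After the copy finishes, clients need to be pointed at the new index. Querying through an index alias from the start makes that switch easier.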
Frequently Asked Questions
How does Elasticsearch compare to PostgreSQL full-text search?
PostgreSQL full-text search works for simple use cases without additional infrastructure. Elasticsearch is purpose-built for search with better relevance scoring, fuzzy matching, faceting, and horizontal scaling. Use PostgreSQL for light search; use Elasticsearch when search is a core feature.
What is the ELK stack?
ELK stands for Elasticsearch, Logstash, and Kibana. Logstash ingests and transforms log data. Elasticsearch stores and indexes it. Kibana provides dashboards and visualizations. Together they form a complete log analytics pipeline.
Does Elasticsearch support vector search?
Yes. Elasticsearch supports dense vector fields and k-nearest neighbor (kNN) search. This enables semantic search and RAG pipelines alongside traditional keyword search in the same index.
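A minimal sketch of what that can look like in Elasticsearch 8.x: a mapping with a `dense_vector` field, and a `_search` body that combines a keyword `match` query with a top-level `knn` clause. The field name `embedding`, the helper names, the `num_candidates` heuristic, and the three-dimensional toy vector are assumptions for illustration; real vectors would come from an embedding model and have far more dimensions.

```python
import json


def build_vector_mapping(dims):
    """Index body with a text field and an indexed dense_vector field."""
    return {
        "mappings": {
            "properties": {
                "description": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }


def build_hybrid_search(text, query_vector, k=5):
    """_search body combining keyword matching with approximate kNN."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {
        "query": {"match": {"description": text}},
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            # Candidates examined per shard; must be >= k (heuristic value).
            "num_candidates": max(k * 10, 50),
        },
        "size": k,
    }


mapping = build_vector_mapping(dims=3)
search = build_hybrid_search("wireless mouse", [0.12, -0.4, 0.9], k=5)
print(json.dumps(mapping, indent=2))
print(json.dumps(search, indent=2))
```

The query vector must have the same number of dimensions as the mapping declares, so `dims` and the embedding model need to be chosen together.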
Is Elasticsearch open source?
The Elasticsearch source code is offered under a choice of licenses: the Server Side Public License (SSPL) and the Elastic License since 2021, with AGPLv3 added as a third option in 2024. The core features are free. Advanced features like searchable snapshots, cross-cluster replication, and machine learning require a paid subscription. OpenSearch is a fork of Elasticsearch licensed under Apache 2.0.
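How do I scale Elasticsearch?
Add nodes to the cluster. Elasticsearch rebalances shards across nodes automatically. For read-heavy workloads, add replica shards, which can be changed on a live index. For write-heavy workloads, use more primary shards; the primary shard count is fixed when an index is created, so raising it means creating a new index (or using the split API) and reindexing. Monitor with the cluster health API.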
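The two operational pieces mentioned here can be sketched as follows: a settings body that changes the replica count (sent with `PUT /<index>/_settings`), and a small helper that interprets a response from `GET /_cluster/health`. The helper names are hypothetical, and the sample health dictionary is hand-written for illustration, not output captured from a real cluster.

```python
def build_replica_settings(replicas):
    """Body for PUT /<index>/_settings that changes the replica count."""
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    return {"index": {"number_of_replicas": replicas}}


def summarize_health(health):
    """Turn a _cluster/health response into a one-line summary."""
    status = health["status"]
    unassigned = health.get("unassigned_shards", 0)
    if status == "green":
        return "ok: all primary and replica shards are assigned"
    if status == "yellow":
        return f"degraded: {unassigned} replica shard(s) unassigned"
    # Red means at least one primary shard is unassigned.
    return f"critical: {unassigned} shard(s) unassigned, including a primary"


# Hand-written sample resembling a single-node cluster with one replica configured.
sample_health = {
    "status": "yellow",
    "number_of_nodes": 1,
    "active_primary_shards": 1,
    "active_shards": 1,
    "unassigned_shards": 1,
}

settings = build_replica_settings(2)
summary = summarize_health(sample_health)
print(settings)
print(summary)
```

A yellow status is normal on a single-node test cluster, since a replica is never placed on the same node as its primary.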