Elasticsearch — Distributed Search and Analytics Engine
Elasticsearch is a widely used search and analytics engine. It provides near-real-time full-text search, structured search, analytics, and logging across petabytes of data, and has powered search for Wikipedia, GitHub, Stack Overflow, and many other applications.
Install with prior review
This asset requires review. The copied prompt requests a dry run, shows the planned writes, and continues only after confirmation.
npx -y tokrepo@latest install 8cbbd0e8-3734-11f1-9bc6-00163e2b0d79 --target codex
Do a dry run first, confirm the writes, then run this command.
What it is
Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It provides near-real-time full-text search, structured search, analytics, and log aggregation across petabytes of data. It powers search for applications, log analysis pipelines, security analytics, and observability stacks.
Elasticsearch targets backend engineers, data engineers, and DevOps teams who need fast search and analytics over large datasets.
How it saves time or tokens
Elasticsearch indexes data in near real time: with the default refresh interval of one second, newly ingested documents typically become searchable about a second after they are indexed. The distributed architecture scales horizontally by adding nodes. Full-text queries that would require slow table scans in a relational database are answered from an inverted index, often in milliseconds.
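The near-real-time behavior can be observed directly. The sketch below indexes a document with the refresh=wait_for parameter, so the request returns only once the document is visible to search, and shows how the per-index refresh_interval setting could be changed. It assumes a local node reachable over plain HTTP at localhost:9200 and the products index from the Example section; the helper names and the sample document are illustrative only.

```shell
# Sample document (illustrative data, same shape as the products example).
doc_body='{ "name": "USB-C Cable", "price": 9.99, "category": "electronics", "description": "Braided USB-C charging cable" }'

# Hypothetical helper: index the document and wait until a refresh
# has made it searchable before the request returns.
index_and_wait() {
  curl -s -X POST 'localhost:9200/products/_doc?refresh=wait_for' \
    -H 'Content-Type: application/json' -d "$doc_body"
}

# Hypothetical helper: build the settings body for a given interval, e.g. 30s.
refresh_settings_body() {
  printf '{ "index": { "refresh_interval": "%s" } }' "$1"
}

# Hypothetical helper: apply the interval to the products index.
set_refresh_interval() {
  curl -s -X PUT 'localhost:9200/products/_settings' \
    -H 'Content-Type: application/json' -d "$(refresh_settings_body "$1")"
}
```

Against a running node, index_and_wait followed by a search should find the new document immediately. A longer interval such as set_refresh_interval 30s trades search freshness for indexing throughput.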
The Query DSL supports complex queries (boolean, fuzzy, range, aggregation) without writing custom search code.
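As an illustration of the query types listed above, the sketch below combines a fuzzy match with a terms aggregation in one request body. It assumes the products index from the Example section and a local node on plain HTTP; the misspelled search text, the aggregation name by_category, and the helper function are made up for the example.

```shell
# Fuzzy full-text match on "name" (tolerates typos such as "wireles mous")
# plus a terms aggregation that counts hits per category.
search_body='{
  "query": {
    "match": {
      "name": { "query": "wireles mous", "fuzziness": "AUTO" }
    }
  },
  "aggs": {
    "by_category": { "terms": { "field": "category" } }
  }
}'

# Hypothetical helper: send the body to the local node.
search_products() {
  curl -s -X GET 'localhost:9200/products/_search' \
    -H 'Content-Type: application/json' -d "$search_body"
}
```

The aggregation runs on category because it is mapped as keyword; terms aggregations on analyzed text fields are not enabled by default.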
How to use
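- Install Elasticsearch (local development only; Elasticsearch 8 enables TLS and authentication by default, so this command disables security to let the plain-HTTP examples work):
docker run -d -p 9200:9200 -e discovery.type=single-node -e xpack.security.enabled=false elasticsearch:8.13.0
- Create an index and map your fields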
- Index documents via the REST API
- Query with the JSON Query DSL
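The container takes a little while to start, so requests sent right after docker run may be refused. A minimal sketch of a readiness check follows, assuming the node listens on plain HTTP at localhost:9200; the function name, default attempt count, and one-second delay are arbitrary choices for illustration.

```shell
# Hypothetical helper: poll the root endpoint until it answers or attempts run out.
# Usage: wait_for_elasticsearch [url] [attempts]
wait_for_elasticsearch() {
  url="${1:-http://localhost:9200}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s hides progress output.
    if curl -sf "$url" >/dev/null 2>&1; then
      echo "Elasticsearch is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "Elasticsearch did not respond at $url" >&2
  return 1
}
```

Calling wait_for_elasticsearch before the first index request keeps the remaining steps from failing on a node that is still booting.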
Example
# Create an index with mappings
curl -X PUT 'localhost:9200/products' -H 'Content-Type: application/json' -d '{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "price": { "type": "float" },
      "category": { "type": "keyword" },
      "description": { "type": "text" }
    }
  }
}'

# Index a document
curl -X POST 'localhost:9200/products/_doc' -H 'Content-Type: application/json' -d '{
  "name": "Wireless Mouse",
  "price": 29.99,
  "category": "electronics",
  "description": "Ergonomic wireless mouse with USB receiver"
}'

# Search with full-text query
curl -X GET 'localhost:9200/products/_search' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "must": { "match": { "description": "wireless" } },
      "filter": { "range": { "price": { "lte": 50 } } }
    }
  }
}'
Related on TokRepo
- Database tools -- Search and database engines
- Monitoring tools -- Log analytics and observability
Common pitfalls
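- Elasticsearch is memory-hungry, but the JVM heap should not take all of it: set the heap (-Xms/-Xmx) to no more than half of available memory and leave the rest for the operating system's filesystem cache, which Lucene relies on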
- Mapping changes on existing indexes require reindexing; plan your field types carefully before production
- The free tier (Basic license) covers core search; advanced security, cross-cluster replication, and ML features require a paid subscription
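The reindexing pitfall above usually plays out as: create a new index with the corrected mapping, then copy the documents across with the _reindex API. The sketch below assumes the products index from the Example section and a local node on plain HTTP; the target name products_v2, the switch of price to scaled_float, and the helper function are hypothetical choices for illustration.

```shell
# Hypothetical target index: same fields as products, but price as scaled_float.
new_index_body='{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "price": { "type": "scaled_float", "scaling_factor": 100 },
      "category": { "type": "keyword" },
      "description": { "type": "text" }
    }
  }
}'

# Copy every document from the old index into the new one.
reindex_body='{
  "source": { "index": "products" },
  "dest": { "index": "products_v2" }
}'

# Hypothetical helper: create products_v2, then run _reindex.
# -f makes curl exit non-zero on an HTTP error, so the copy only
# starts if the new index was created successfully.
migrate_products() {
  curl -sf -X PUT 'localhost:9200/products_v2' \
    -H 'Content-Type: application/json' -d "$new_index_body" &&
  curl -sf -X POST 'localhost:9200/_reindex' \
    -H 'Content-Type: application/json' -d "$reindex_body"
}
```

Applications that query through an index alias instead of a concrete index name can switch to the new index with a single alias update once the copy has finished.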
Frequently asked questions
How does Elasticsearch compare to PostgreSQL full-text search?
PostgreSQL full-text search works for simple use cases without additional infrastructure. Elasticsearch is purpose-built for search, with better relevance scoring, fuzzy matching, faceting, and horizontal scaling. Use PostgreSQL for light search; use Elasticsearch when search is a core feature.
What is the ELK stack?
ELK stands for Elasticsearch, Logstash, and Kibana. Logstash ingests and transforms log data. Elasticsearch stores and indexes it. Kibana provides dashboards and visualizations. Together they form a complete log analytics pipeline.
Does Elasticsearch support vector search?
Yes. Elasticsearch supports dense vector fields and k-nearest neighbor (kNN) search. This enables semantic search and RAG pipelines alongside traditional keyword search in the same index.
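A minimal sketch of what vector search looks like on an 8.x node: a mapping with a dense_vector field and a search request with a top-level knn clause. The index name articles, the field names, and the three-dimensional vectors are assumptions chosen to keep the example short; real embeddings typically have hundreds of dimensions and come from an embedding model.

```shell
# Hypothetical index with a text field and an indexed 3-dimensional vector field.
vector_index_body='{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "embedding": { "type": "dense_vector", "dims": 3, "index": true, "similarity": "cosine" }
    }
  }
}'

# Approximate kNN search: return the 5 nearest documents,
# considering 50 candidates per shard.
knn_search_body='{
  "knn": {
    "field": "embedding",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 5,
    "num_candidates": 50
  },
  "_source": ["title"]
}'

# Hypothetical helpers: create the index and run the search on a local node.
create_vector_index() {
  curl -s -X PUT 'localhost:9200/articles' \
    -H 'Content-Type: application/json' -d "$vector_index_body"
}

knn_search() {
  curl -s -X GET 'localhost:9200/articles/_search' \
    -H 'Content-Type: application/json' -d "$knn_search_body"
}
```

The query_vector must have the same number of dimensions as the mapped field, and it would normally be produced by the same model that generated the stored embeddings.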
What license does Elasticsearch use?
Elasticsearch source code is available under the Server Side Public License (SSPL) and the Elastic License 2.0, and since 2024 also under the AGPLv3. The core features are free. Advanced features like searchable snapshots, cross-cluster replication, and machine learning require a paid subscription. OpenSearch is a fully open-source fork under Apache 2.0.
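How do I scale Elasticsearch?
Add nodes to the cluster. Elasticsearch distributes shards across nodes automatically. For read-heavy workloads, add replica shards. For write-heavy workloads, use more primary shards; the primary shard count is fixed when an index is created, so changing it later means splitting or reindexing. Monitor with the cluster health API.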
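The scaling knobs mentioned above map onto a few REST calls, sketched here under the assumption of a local node on plain HTTP. The index name logs_demo, the shard and replica counts, and the helper function names are illustrative only.

```shell
# Primary shard count is chosen when the index is created.
sharded_index_body='{
  "settings": { "number_of_shards": 3, "number_of_replicas": 1 }
}'

# Hypothetical helper: build a settings body for a given replica count.
replica_settings_body() {
  printf '{ "index": { "number_of_replicas": %s } }' "$1"
}

# Hypothetical helper: create the index with three primary shards.
create_sharded_index() {
  curl -s -X PUT 'localhost:9200/logs_demo' \
    -H 'Content-Type: application/json' -d "$sharded_index_body"
}

# Hypothetical helper: change the replica count on the live index.
set_replicas() {
  curl -s -X PUT 'localhost:9200/logs_demo/_settings' \
    -H 'Content-Type: application/json' -d "$(replica_settings_body "$1")"
}

# Hypothetical helper: show cluster status (green, yellow, or red) and shard counts.
cluster_health() {
  curl -s 'localhost:9200/_cluster/health?pretty'
}
```

On a single-node cluster, replicas cannot be allocated, so cluster_health reports yellow until a second node joins or the replica count is set to 0.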
Similar assets
Quickwit — Cloud-Native Sub-Second Search Engine
Quickwit is a cloud-native search engine built in Rust for log management and distributed search on object storage. It indexes data directly to S3-compatible stores, enabling cost-efficient search at petabyte scale.
OpenSearch — Community-Driven Search and Analytics Suite
OpenSearch is an open-source search and analytics suite forked from Elasticsearch 7.10. It provides full-text search, log analytics, observability, and security analytics — all under the Apache-2.0 license with no feature restrictions.
ParadeDB — Elasticsearch-Quality Search Inside Postgres
ParadeDB is an open-source PostgreSQL extension that brings full-text search, hybrid search, and analytics capabilities directly into Postgres, replacing the need for a separate Elasticsearch cluster.
ManticoreSearch — Fast Open-Source Search Engine with SQL
ManticoreSearch is a high-performance open-source search engine that supports full-text search, columnar storage, and real-time indexing. It provides a MySQL-compatible SQL interface and can serve as a drop-in replacement for Sphinx.