Scripts · Apr 13, 2026 · 3 min read

Elasticsearch — Distributed Search and Analytics Engine

Elasticsearch is a widely used distributed search and analytics engine. It provides near-real-time full-text search, structured search, analytics, and log aggregation across petabytes of data, and it powers search for sites such as Wikipedia, GitHub, and Stack Overflow.

TL;DR
Elasticsearch provides near-real-time full-text search, analytics, and log aggregation at petabyte scale.
§01

What it is

Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It provides near-real-time full-text search, structured search, analytics, and log aggregation across petabytes of data. It powers search for applications, log analysis pipelines, security analytics, and observability stacks.

Elasticsearch targets backend engineers, data engineers, and DevOps teams who need fast search and analytics over large datasets.

§02

How it saves time or tokens

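Elasticsearch indexes data in near real-time: by default an index is refreshed about once per second, so newly ingested documents typically become searchable within a second or so. The distributed architecture scales horizontally by adding nodes. Because lookups go through an inverted index rather than a table scan, full-text queries that are slow in a relational database often return in milliseconds.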
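The refresh behaviour described here can be tuned per index. The following is a minimal sketch, assuming an unauthenticated local development node at http://localhost:9200 and an index named products; the helper function names and the 30s value are illustrative, not recommendations.

```shell
# Assumed local dev node; override ES_URL for a real cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# A longer refresh interval trades search freshness for indexing throughput.
REFRESH_SETTINGS='{ "index": { "refresh_interval": "30s" } }'

# Hypothetical helper: apply the setting to the products index.
set_refresh_interval() {
  curl -s -X PUT "$ES_URL/products/_settings" \
    -H 'Content-Type: application/json' -d "$REFRESH_SETTINGS"
}

# Hypothetical helper: force a refresh so recent writes become searchable now.
refresh_now() {
  curl -s -X POST "$ES_URL/products/_refresh"
}
```

Call set_refresh_interval before a bulk load and refresh_now afterwards if the new documents need to be visible immediately.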

The Query DSL expresses boolean, fuzzy, and range queries, as well as aggregations, in JSON, so you do not have to write custom search code.
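As a rough illustration of how these query types combine, the sketch below builds one request that mixes a fuzzy match, a range filter, and a terms aggregation. It assumes a products index with name, price, and category fields and an unauthenticated local node; search_products is a hypothetical helper name.

```shell
# Assumed local dev node; override ES_URL for a real cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Fuzzy match tolerates the typo "wireles"; the range filter bounds the price;
# the terms aggregation counts matching documents per category.
QUERY_BODY='{
  "query": {
    "bool": {
      "must": { "match": { "name": { "query": "wireles mouse", "fuzziness": "AUTO" } } },
      "filter": { "range": { "price": { "gte": 10, "lte": 50 } } }
    }
  },
  "aggs": {
    "by_category": { "terms": { "field": "category" } }
  }
}'

# Hypothetical helper: send the query to the products index.
search_products() {
  curl -s -X POST "$ES_URL/products/_search" \
    -H 'Content-Type: application/json' -d "$QUERY_BODY"
}
```

Running search_products against a live node returns the hits together with a by_category bucket list in the same response.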

§03

How to use

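  1. Install Elasticsearch (single node, security disabled for local development only): docker run -d -p 9200:9200 -e discovery.type=single-node -e xpack.security.enabled=false elasticsearch:8.13.0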
  2. Create an index and map your fields
  3. Index documents via the REST API
  4. Query with the JSON Query DSL
§04

Example

# Create an index with mappings
curl -X PUT 'localhost:9200/products' -H 'Content-Type: application/json' -d '{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "price": { "type": "float" },
      "category": { "type": "keyword" },
      "description": { "type": "text" }
    }
  }
}'

# Index a document (refresh=wait_for returns once the document is searchable)
curl -X POST 'localhost:9200/products/_doc?refresh=wait_for' -H 'Content-Type: application/json' -d '{
  "name": "Wireless Mouse",
  "price": 29.99,
  "category": "electronics",
  "description": "Ergonomic wireless mouse with USB receiver"
}'

# Search with full-text query
curl -X GET 'localhost:9200/products/_search' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "must": { "match": { "description": "wireless" } },
      "filter": { "range": { "price": { "lte": 50 } } }
    }
  }
}'
§06

Common pitfalls

  • Elasticsearch consumes significant RAM; set the JVM heap to no more than half of available memory (and below roughly 32 GB), leaving the rest for the operating system's filesystem cache
  • Mapping changes on existing indexes require reindexing; plan your field types carefully before production
  • The free tier (Basic license) covers core search; advanced security, cross-cluster replication, and ML features require a paid subscription
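The reindexing pitfall above can be sketched as follows: create a new index with the corrected mapping, then copy documents across with the _reindex API. The target index name products_v2, the switch of price to scaled_float, and the migrate_products helper are illustrative assumptions, and the node is assumed to be an unauthenticated local one.

```shell
# Assumed local dev node; override ES_URL for a real cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# New index with the changed field type (illustrative: price as scaled_float).
NEW_INDEX_BODY='{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "price": { "type": "scaled_float", "scaling_factor": 100 },
      "category": { "type": "keyword" },
      "description": { "type": "text" }
    }
  }
}'

# Copy every document from the old index into the new one.
REINDEX_BODY='{
  "source": { "index": "products" },
  "dest": { "index": "products_v2" }
}'

# Hypothetical helper: create the new index, then reindex only if that succeeded.
migrate_products() {
  curl -s --fail -X PUT "$ES_URL/products_v2" \
    -H 'Content-Type: application/json' -d "$NEW_INDEX_BODY" &&
  curl -s --fail -X POST "$ES_URL/_reindex" \
    -H 'Content-Type: application/json' -d "$REINDEX_BODY"
}
```

Once the copy finishes, applications can be pointed at the new index; using an alias for the index name makes that switch easier.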

Frequently Asked Questions

How does Elasticsearch compare to PostgreSQL full-text search?

PostgreSQL full-text search works for simple use cases without additional infrastructure. Elasticsearch is purpose-built for search with better relevance scoring, fuzzy matching, faceting, and horizontal scaling. Use PostgreSQL for light search; use Elasticsearch when search is a core feature.

What is the ELK Stack?

ELK stands for Elasticsearch, Logstash, and Kibana. Logstash ingests and transforms log data. Elasticsearch stores and indexes it. Kibana provides dashboards and visualizations. Together they form a complete log analytics pipeline.

Can Elasticsearch handle vector search?

Yes. Elasticsearch supports dense vector fields and k-nearest neighbor (kNN) search. This enables semantic search and RAG pipelines alongside traditional keyword search in the same index.
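A minimal sketch of what this looks like on an 8.x node, assuming an unauthenticated local instance. The index name products_vec, the embedding field, and the three-dimensional toy vector are illustrative; real embeddings usually have hundreds of dimensions.

```shell
# Assumed local dev node; override ES_URL for a real cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Mapping with an indexed dense_vector field next to a regular text field.
VECTOR_MAPPING='{
  "mappings": {
    "properties": {
      "description": { "type": "text" },
      "embedding": { "type": "dense_vector", "dims": 3, "index": true, "similarity": "cosine" }
    }
  }
}'

# Approximate kNN search: return the 5 nearest neighbours out of 50 candidates per shard.
KNN_QUERY='{
  "knn": {
    "field": "embedding",
    "query_vector": [0.12, -0.48, 0.33],
    "k": 5,
    "num_candidates": 50
  },
  "_source": ["description"]
}'

# Hypothetical helpers: create the index and run the kNN search.
create_vector_index() {
  curl -s -X PUT "$ES_URL/products_vec" \
    -H 'Content-Type: application/json' -d "$VECTOR_MAPPING"
}

knn_search() {
  curl -s -X POST "$ES_URL/products_vec/_search" \
    -H 'Content-Type: application/json' -d "$KNN_QUERY"
}
```

The query vector would normally come from the same embedding model that produced the indexed vectors.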

What is the licensing model?

Elasticsearch's source code is available under a choice of licenses: the Server Side Public License (SSPL), the Elastic License 2.0, and, since 2024, the GNU AGPLv3. The core features are free. Advanced features like searchable snapshots, cross-cluster replication, and machine learning require a paid subscription. OpenSearch is a fully open-source fork under Apache 2.0.

How do I scale Elasticsearch?

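Add nodes to the cluster. Elasticsearch distributes shards across nodes automatically. For read-heavy workloads, add replica shards. For write-heavy workloads, plan for more primary shards; the primary shard count is fixed when an index is created, so increasing it means creating a new index (or using the split API) and reindexing. Monitor with the cluster health API.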
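The replica and monitoring steps might look like the sketch below, assuming an unauthenticated local node and the products index. The helper names and the replica count of 2 are illustrative.

```shell
# Assumed local dev node; override ES_URL for a real cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# The replica count can be changed on a live index, unlike the primary shard count.
REPLICA_SETTINGS='{ "index": { "number_of_replicas": 2 } }'

# Hypothetical helper: raise the replica count for the products index.
set_replicas() {
  curl -s -X PUT "$ES_URL/products/_settings" \
    -H 'Content-Type: application/json' -d "$REPLICA_SETTINGS"
}

# Hypothetical helper: report cluster status (green, yellow, or red) and shard counts.
cluster_health() {
  curl -s "$ES_URL/_cluster/health?pretty"
}
```

On a single-node cluster the replicas cannot be assigned, so cluster_health would be expected to report yellow until more nodes join.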
