# Elasticsearch: Distributed Search and Analytics Engine

> Elasticsearch is one of the most widely used search and analytics engines. It provides near-real-time full-text search, structured search, analytics, and logging across petabytes of data, powering search for Wikipedia, GitHub, Stack Overflow, and millions of applications.

## Install

Save the Quick Use commands as a script file and run it.

## Quick Use

```bash
# Run with Docker (single node with security disabled: local development only)
docker run -d --name elasticsearch \
  -p 9200:9200 -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  docker.elastic.co/elasticsearch/elasticsearch:8.15.0

# Index a document (the "products" index is created automatically)
curl -X POST "localhost:9200/products/_doc" \
  -H "Content-Type: application/json" \
  -d '{"name": "Laptop", "price": 999, "description": "Fast laptop with 16GB RAM"}'

# Search
curl "localhost:9200/products/_search?q=fast+laptop"
```

## Introduction

Elasticsearch is among the most widely deployed search engines. Built on Apache Lucene, it distributes search and analytics across clusters of machines, providing near-real-time results over billions of documents. It powers website search, log analysis (ELK Stack), application performance monitoring, and security analytics.

With over 77,000 GitHub stars, Elasticsearch is used by Wikipedia, GitHub, Netflix, Uber, and many other major tech companies for full-text search, log aggregation, metrics, and security information management.

## What Elasticsearch Does

Elasticsearch stores JSON documents and makes them searchable in near-real-time. It automatically distributes data across nodes, handles replication for fault tolerance, and provides a powerful query DSL for full-text search, structured queries, aggregations, and geospatial queries. The ELK Stack (Elasticsearch + Logstash + Kibana) is a widely used standard for log management.
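The store-then-search flow described above can be sketched with a few shell helpers. This is a minimal sketch, not an official client: it assumes the single-node cluster and `products` index from the Quick Use example, and `ES_URL`, `RUN_DEMO`, `build_query`, `index_doc`, and `search` are hypothetical names introduced only for illustration. The `refresh=wait_for` parameter asks Elasticsearch to hold the response until the document is visible to search, which makes the near-real-time behavior explicit.

```bash
#!/usr/bin/env bash
# Sketch only: ES_URL, RUN_DEMO and the function names are illustrative assumptions.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build a Query DSL body: full-text match on "description" plus a price ceiling.
# Note: the search text is inserted as-is, so it must not contain quotes or backslashes.
build_query() {
  local text="$1" max_price="$2"
  printf '{"query":{"bool":{"must":[{"match":{"description":"%s"}}],"filter":[{"range":{"price":{"lte":%s}}}]}}}' \
    "$text" "$max_price"
}

# Index one JSON document and wait until it becomes searchable.
index_doc() {
  curl -s -X POST "$ES_URL/products/_doc?refresh=wait_for" \
    -H "Content-Type: application/json" \
    -d "$1"
}

# Run the query built above against the products index.
search() {
  curl -s -X POST "$ES_URL/products/_search" \
    -H "Content-Type: application/json" \
    -d "$(build_query "$1" "$2")"
}

# Only contact the cluster when explicitly requested.
if [ "${RUN_DEMO:-0}" = "1" ]; then
  index_doc '{"name": "Laptop", "price": 999, "description": "Fast laptop with 16GB RAM"}'
  search "fast laptop" 1500
fi
```

Run it with `RUN_DEMO=1` once the container is up. Without that variable the script only defines the helpers, so `build_query "fast laptop" 1500` can be inspected on its own before anything is sent to the cluster.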
## Architecture Overview

```
[Applications / Logstash / Beats]
            |
  [Elasticsearch Cluster]
            |
+-------+-------+-------+
|       |       |       |
[Node 1] [Node 2] [Node 3]
Primary  Replica  Replica
shards   shards   shards
            |
     [Lucene Indexes]
Inverted index for full-text search
BKD trees for numerics
Doc values for aggregations
            |
       [Query DSL]
match, term, range, bool,
nested, geo, aggregations
            |
        [Kibana]
Visualization & dashboard UI
```

## Self-Hosting & Configuration

```yaml
# Docker Compose with Kibana
version: "3.8"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.15.0
    environment:
      - discovery.type=single-node
      # Fixed 2 GB heap; min and max are kept equal
      - ES_JAVA_OPTS=-Xms2g -Xmx2g
      # Security disabled for local development only
      - xpack.security.enabled=false
    ports:
      - 9200:9200
    volumes:
      - esdata:/usr/share/elasticsearch/data
  kibana:
    image: docker.elastic.co/kibana/kibana:8.15.0
    ports:
      - 5601:5601
    depends_on:
      - elasticsearch
volumes:
  esdata:
```

Search query example, sent as the body of a `_search` request. The `by_category` aggregation assumes the documents carry a `category` field with a `keyword` sub-field:

```json
{
  "query": {
    "bool": {
      "must": [
        { "match": { "description": "fast laptop" } }
      ],
      "filter": [
        { "range": { "price": { "lte": 1500 } } }
      ]
    }
  },
  "aggs": {
    "avg_price": { "avg": { "field": "price" } },
    "by_category": { "terms": { "field": "category.keyword" } }
  },
  "highlight": {
    "fields": { "description": {} }
  }
}
```

## Key Features

- **Full-Text Search**: powerful text analysis, stemming, and relevance scoring
- **Near Real-Time**: documents are searchable within about 1 second of indexing (the default refresh interval)
- **Distributed**: automatic sharding and replication across nodes
- **Aggregations**: real-time analytics (metrics, buckets, pipelines)
- **Query DSL**: rich JSON query language for complex searches
- **ELK Stack**: complete log management with Logstash and Kibana
- **Vector Search**: kNN search for AI/ML embeddings (8.0+)
- **Cross-Cluster**: search and replicate across multiple clusters

## Comparison with Similar Tools

| Feature | Elasticsearch | OpenSearch | Meilisearch | Typesense | Solr |
|---|---|---|---|---|---|
| Scale | Petabytes | Petabytes | Single-node | Cluster | Petabytes |
| Query Language | Query DSL | Query DSL | Simple | Simple | Solr Query |
| Analytics | Excellent | Excellent | No | No | Good |
| Setup Complexity | Moderate | Moderate | Very Low | Low | High |
| License | SSPL / Elastic License 2.0 | Apache-2.0 | MIT | GPL-3.0 | Apache-2.0 |
| Kibana Equivalent | Kibana | OpenSearch Dashboards | Built-in UI | Built-in UI | Banana |
| Best For | Enterprise search + logs | AWS/OSS alternative | Small apps | Small apps | Legacy |

## FAQ

**Q: Elasticsearch vs OpenSearch: which should I choose?**
A: OpenSearch is a fork created after Elastic changed its license. OpenSearch is Apache-2.0 (fully open source). Elasticsearch has more features and faster releases. Choose OpenSearch for true open source; choose Elasticsearch for the latest features.

**Q: How much memory does Elasticsearch need?**
A: Minimum 2GB heap for development. In production, allocate no more than 50% of available RAM to the Elasticsearch heap and keep the heap below roughly 32GB so the JVM can keep using compressed object pointers. Leave the remaining RAM for the OS filesystem cache, which Lucene relies on.

**Q: Is Elasticsearch free?**
A: The basic features are free, including core security (TLS, authentication, and role-based access control). Advanced features (such as SAML single sign-on, field-level security, machine learning, and cross-cluster replication) require a paid subscription or the Elastic Cloud service.

**Q: When should I NOT use Elasticsearch?**
A: For primary data storage (use a database), simple key-value lookups (use Redis), or small datasets where a database LIKE query suffices.

## Sources

- GitHub: https://github.com/elastic/elasticsearch
- Documentation: https://www.elastic.co/docs
- Created by Shay Banon (Elastic)
- License: SSPL / Elastic License 2.0

---

Source: https://tokrepo.com/en/workflows/8cbbd0e8-3734-11f1-9bc6-00163e2b0d79

Author: Script Depot