# Vespa — Real-Time Big Data Serving Engine

> Vespa is an open-source big data serving engine from Yahoo that handles search, recommendation, and ranking at scale. It combines full-text search, vector search, and structured data queries in a single platform capable of serving thousands of queries per second over billions of documents.

## Quick Use

```bash
docker run --detach --name vespa --hostname vespa-container \
  --publish 8080:8080 --publish 19071:19071 \
  vespaengine/vespa
# Deploy an application package via the config API on port 19071
```

## Introduction

Vespa was built at Yahoo to power search, ad serving, and recommendation systems at web scale. It unifies text search, vector similarity search, and structured queries in one engine, removing the need to stitch together separate databases for different query types.

## What Vespa Does

- Serves full-text, vector, and structured queries over billions of documents
- Supports real-time indexing with partial document updates at scale
- Executes custom ranking models including ONNX and TensorFlow at query time
- Provides built-in machine-learned ranking with phased evaluation
- Handles grouping, aggregation, and geo-spatial queries natively

## Architecture Overview

Vespa uses a content cluster of stateful nodes that store indexed data, a container cluster of stateless nodes that handle query processing and document feeding, and a config cluster that manages application deployment. Data is distributed and replicated across content nodes using a consistent hashing scheme. Ranking happens in two phases: a fast first-phase over all matches, then a detailed second-phase on top candidates.
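The two-phase ranking described above can be sketched in plain Python. This is an illustrative simulation, not Vespa's actual implementation: the `quality` field and the term-overlap second phase are stand-ins for whatever cheap signal and expensive model a real rank profile would use.

```python
# Sketch of phased ranking: a cheap first-phase score orders all
# matches, then an expensive second-phase score re-ranks only the
# top `rerank_count` candidates. Field names are assumptions.

def first_phase(doc):
    # Cheap heuristic, e.g. a precomputed document quality signal
    return doc["quality"]

def second_phase(doc, query_terms):
    # Expensive scoring; term overlap stands in for an ML model here
    overlap = len(query_terms & set(doc["text"].split()))
    return doc["quality"] + overlap

def rank(docs, query_terms, rerank_count=2):
    # Phase 1: score every match cheaply, keep the best candidates
    candidates = sorted(docs, key=first_phase, reverse=True)[:rerank_count]
    # Phase 2: re-rank only the survivors with the expensive function
    return sorted(candidates,
                  key=lambda d: second_phase(d, query_terms),
                  reverse=True)

docs = [
    {"id": 1, "quality": 0.9, "text": "vespa big data serving"},
    {"id": 2, "quality": 0.8, "text": "vector search engine"},
    {"id": 3, "quality": 0.1, "text": "vespa vector search engine"},
]
top = rank(docs, {"vector", "search"})
print([d["id"] for d in top])  # doc 2 overtakes doc 1 in the second phase
```

The point of the split is cost control: the expensive function runs on a bounded candidate set regardless of how many documents matched.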
## Self-Hosting & Configuration

- Deploy via Docker for development, or use the Vespa Cloud managed service
- Define document schemas, ranking profiles, and query processing in an application package
- Feed documents via the JSON-based document API over HTTP
- Configure content distribution with redundancy and searchable-copies settings
- Use the Vespa CLI to deploy, test, and monitor applications

## Key Features

- Hybrid search combining BM25 text scoring with ANN vector similarity
- Real-time partial updates without full document reindexing
- Built-in ONNX Runtime for executing ML models during ranking
- Grouping engine for faceted navigation and analytics
- Automatic data distribution, rebalancing, and node failure recovery

## Comparison with Similar Tools

- **Elasticsearch** — strong for text search but lacks native vector and ML ranking; Vespa integrates both natively
- **Milvus** — specialized vector database; Vespa combines vectors with text and structured data
- **Apache Solr** — mature text search; Vespa adds real-time ML ranking and tensor computation
- **Weaviate** — vector database with modules; Vespa is a broader serving engine for mixed workloads
- **Qdrant** — lightweight vector search; Vespa handles larger scale with richer query semantics

## FAQ

**Q: Is Vespa only for search?**
A: No. Vespa powers recommendation, personalization, ad serving, and any application needing real-time computation over large datasets.

**Q: Can Vespa run ML models during queries?**
A: Yes. Vespa evaluates ONNX, TensorFlow, XGBoost, and LightGBM models as part of its ranking pipeline at query time.

**Q: How does Vespa handle scaling?**
A: Add content nodes and Vespa automatically redistributes data. The system scales reads by adding container nodes.

**Q: Is there a managed cloud option?**
A: Yes. Vespa Cloud provides a fully managed deployment with auto-scaling and monitoring.
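Feeding and partial updates go through the JSON document API mentioned above. The sketch below builds a partial-update request for Vespa's `/document/v1` endpoint; the `music` namespace, document type, and `album` field are assumptions for illustration, and real names come from your application schema.

```python
import json

def partial_update_request(namespace, doctype, doc_id, assignments):
    # Document API path: /document/v1/<namespace>/<doctype>/docid/<id>
    # Host and port assume the local Docker setup from Quick Use.
    url = (f"http://localhost:8080/document/v1/"
           f"{namespace}/{doctype}/docid/{doc_id}")
    # A partial update assigns new values field by field, so the
    # rest of the document is not reindexed.
    body = {"fields": {field: {"assign": value}
                       for field, value in assignments.items()}}
    return url, json.dumps(body)

url, body = partial_update_request("music", "music", "1",
                                   {"album": "Parachutes"})
print(url)
print(body)
```

The resulting request would be sent as an HTTP PUT with a JSON body, e.g. `curl -X PUT -H 'Content-Type: application/json' --data "$body" "$url"`.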
## Sources

- https://github.com/vespa-engine/vespa
- https://docs.vespa.ai