Introduction
ParadeDB extends PostgreSQL with Elasticsearch-grade full-text search and analytics capabilities. Instead of syncing data to a separate search engine, you run BM25 search, sparse vector queries, and columnar analytics directly inside your Postgres database.
What ParadeDB Does
- Adds BM25 full-text search to Postgres via the pg_search extension
- Supports hybrid search combining BM25 text scoring with vector similarity
- Provides columnar storage and vectorized analytics via the pg_analytics extension
- Integrates with standard Postgres tooling (psql, ORMs, connection poolers)
- Eliminates the need for a separate Elasticsearch or OpenSearch deployment
Architecture Overview
ParadeDB ships two core Postgres extensions. pg_search implements an inverted index using Tantivy, a Rust full-text search library inspired by Apache Lucene, and exposes BM25 scoring through a custom Postgres operator. pg_analytics adds a columnar table access method built on Apache Arrow and DataFusion for analytical queries. Both extensions run inside the Postgres process and use standard Postgres transactions.
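To make the inverted index and BM25 scoring concrete, here is a minimal Python sketch of the general technique. It is not ParadeDB's or Tantivy's implementation: the whitespace tokenizer, the function names, the sample documents, and the parameter defaults (k1=1.2, b=0.75) are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real tokenizer)."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    postings = defaultdict(dict)
    lengths = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        lengths[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            postings[term][doc_id] = tf
    return postings, lengths


def bm25_search(query, postings, lengths, k1=1.2, b=0.75):
    """Score documents against a query with the Okapi BM25 formula."""
    n_docs = len(lengths)
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        matches = postings.get(term, {})
        if not matches:
            continue
        # IDF variant with "1 +" inside the log, so it never goes negative.
        idf = math.log(1 + (n_docs - len(matches) + 0.5) / (len(matches) + 0.5))
        for doc_id, tf in matches.items():
            # Length normalization: longer-than-average documents are penalized.
            norm = 1 - b + b * lengths[doc_id] / avg_len
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + k1 * norm)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample documents, keyed by id.
docs = {
    1: "red running shoes",
    2: "blue running jacket for cold weather running",
    3: "leather office shoes",
}
postings, lengths = build_index(docs)
results = bm25_search("running shoes", postings, lengths)
print(results[0][0])  # document 1 matches both query terms and ranks first
```

The index maps each term to the documents that contain it, so a query only touches the postings for its own terms rather than scanning every row. That property is what lets an inverted index serve relevance-ranked search efficiently.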
Self-Hosting & Configuration
- Run the ParadeDB Docker image, which packages standard Postgres with both extensions pre-installed
- Alternatively, install pg_search and pg_analytics as extensions on an existing Postgres instance
- Create BM25 indexes with standard CREATE INDEX syntax using the bm25 access method
- Configure tokenizers, analyzers, and search options per index
- Compatible with managed Postgres providers that support custom extensions
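As a rough illustration of the index-creation step, the Python helper below assembles a CREATE INDEX statement that uses the bm25 access method. The `key_field` option reflects the syntax as recalled from recent pg_search documentation and should be treated as an assumption to verify against the version you run. The index name, table name, and column names are hypothetical.

```python
def bm25_index_sql(index_name, table, columns, key_field):
    """Return a CREATE INDEX statement that uses the bm25 access method.

    Identifiers are interpolated as-is, so pass only trusted names.
    """
    if key_field not in columns:
        raise ValueError("key_field must be one of the indexed columns")
    column_list = ", ".join(columns)
    return (
        f"CREATE INDEX {index_name} ON {table} "
        f"USING bm25 ({column_list}) "
        # Assumption: pg_search expects a unique key column via key_field.
        f"WITH (key_field='{key_field}');"
    )


# Hypothetical table and columns, for illustration only.
statement = bm25_index_sql(
    "search_idx", "mock_items", ["id", "description", "category"], "id"
)
print(statement)
```

The resulting string can be run through psql or any Postgres client. Tokenizer and analyzer settings are configured per index as well, but their exact option names vary by release and are not shown here.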
Key Features
- BM25 full-text search with relevance scoring, phrase matching, and faceted search
- Hybrid search combining keyword (BM25) and semantic (vector) results in one query
- Sparse vector search for learned sparse retrieval models like SPLADE
- Columnar analytics with vectorized execution for OLAP-style queries
- Standard SQL interface with no proprietary query language to learn
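Hybrid search has to merge a keyword ranking and a vector ranking into one result list. The source does not say which fusion method ParadeDB uses. The sketch below shows reciprocal rank fusion (RRF), one common technique, in plain Python. The function name, the constant k=60, and the document ids are assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by several retrievers rise to the top.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda doc_id: fused[doc_id], reverse=True)


keyword_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical BM25 order
semantic_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical vector order
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF works on ranks rather than raw scores, which sidesteps the problem that BM25 scores and vector distances live on different scales. A weighted sum of normalized scores is a common alternative when one signal should dominate.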
Comparison with Similar Tools
- Elasticsearch: Purpose-built search engine with its own query DSL and significant operational overhead; ParadeDB runs inside Postgres
- OpenSearch: AWS-initiated fork of Elasticsearch with a similar architecture; still requires a separate cluster
- Meilisearch: Fast, typo-tolerant search API, but a standalone service that is not SQL-based
- Typesense: Developer-friendly search engine, but requires syncing data from your primary database
- pgvector: Postgres extension for vector similarity only; ParadeDB adds BM25 text search and hybrid scoring
FAQ
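Q: Can I use ParadeDB with my existing PostgreSQL database? A: Yes, provided you can install custom extensions on it. pg_search and pg_analytics are standard Postgres extensions for Postgres 14+, so they work on self-managed instances and on managed providers that allow custom extensions.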
Q: Does ParadeDB replace Elasticsearch for all use cases? A: ParadeDB covers full-text search, hybrid search, and basic analytics. For specialized use cases like log aggregation at massive scale, a dedicated search engine may still be appropriate.
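Q: How does search performance compare to Elasticsearch? A: ParadeDB uses Tantivy, a Rust-based search library that benchmarks competitively with Lucene, so performance should be comparable for most application search workloads. Results depend on data size, schema, and query mix, so benchmark against your own workload before committing.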
Q: What is the license? A: ParadeDB is licensed under AGPL-3.0.