Introduction
pgvector brings dense-vector storage, indexing and distance functions to PostgreSQL, turning your existing SQL database into a first-class vector store. You keep transactions, joins, row-level security and backups while adding semantic search, RAG retrieval or recommendation ranking — no separate vector DB to operate.
What pgvector Does
- Adds the `vector`, `halfvec` and `sparsevec` types for dense and sparse embeddings, and supports Postgres's `bit` type for binary vectors.
- Ships distance operators: `<->` L2, `<=>` cosine, `<#>` negative inner product, `<+>` L1.
- Builds approximate indexes with HNSW (high recall) or IVFFlat (low memory).
- Supports exact nearest-neighbour search when you need 100% recall on small sets.
- Works with any Postgres client; LangChain, LlamaIndex, Django, Rails and Drizzle all have pgvector support.
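To make the four distance operators concrete, here is a minimal pure-Python sketch of what each one computes. The function names are hypothetical and chosen for illustration; pgvector evaluates these in C inside Postgres, so treat this as a reference for the arithmetic rather than the extension's implementation.

```python
import math

def l2_distance(a, b):
    """Euclidean distance, the quantity `<->` returns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """One minus cosine similarity, the quantity `<=>` returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def negative_inner_product(a, b):
    """`<#>` negates the inner product so that smaller still means closer."""
    return -sum(x * y for x, y in zip(a, b))

def l1_distance(a, b):
    """Manhattan (taxicab) distance, the quantity `<+>` returns."""
    return sum(abs(x - y) for x, y in zip(a, b))

a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]  # same direction as a, twice the length

print(l2_distance(a, b))             # 3.0
print(l1_distance(a, b))             # 5.0
print(negative_inner_product(a, b))  # -18.0
print(cosine_distance(a, b))         # 0.0
```

The last line shows that cosine distance depends only on direction: `b` is a scaled copy of `a`, so the distance is zero even though the L2 and L1 distances are not.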
Architecture Overview
Vectors are stored as fixed-dimension arrays of float4 values in ordinary Postgres heap tables, so MVCC, WAL and streaming replication work as usual. An HNSW index is a layered proximity graph kept in regular index pages and cached in shared buffers like any other index; an IVFFlat index clusters vectors around k-means centroids and stores each vector in the list of its nearest centroid. Queries run through the Postgres executor, so you can combine `ORDER BY embedding <=> $1` with `WHERE` filters, joins and pagination in a single plan.
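The IVFFlat idea described above can be sketched in a few lines of pure Python. This is a toy model under stated assumptions, not pgvector's code: the data is hypothetical, the centroids are written out by hand (a real build derives them with k-means), and `build_lists` and `search` are illustrative names.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_lists(vectors, centroids):
    """Assign every vector id to the list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vec_id, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vec_id)
    return lists

def search(query, vectors, centroids, lists, probes, k):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [vid for c in order[:probes] for vid in lists[c]]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]

# Hypothetical 2-D data; each centroid is the mean of the vectors in its list.
vectors = [(0.0, 0.0), (1.0, 0.0), (6.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
centroids = [(0.5, 0.0), (9.0, 0.0)]
lists = build_lists(vectors, centroids)

query = (4.5, 0.0)
print(search(query, vectors, centroids, lists, probes=1, k=2))  # [1, 0]
print(search(query, vectors, centroids, lists, probes=2, k=2))  # [2, 1]
```

With `probes=1` the search misses vector 2, the true nearest neighbour, because it sits in the other list just across the cluster boundary. Scanning both lists recovers it, which is the recall-versus-speed trade-off that the probes setting controls.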
Self-Hosting & Configuration
- Install via apt, yum, Homebrew or Docker (`pgvector/pgvector:pg16`), or compile from source.
- Raise `maintenance_work_mem` (for example 4–8 GB) before building HNSW indexes on large tables.
- For HNSW: raise `hnsw.ef_search` at query time for higher recall; tune `m` and `ef_construction` at build time.
- For IVFFlat: start with `lists` ≈ rows / 1000 for up to 1M rows and ≈ √rows beyond that; keep `probes` small at first (around √lists) and raise it until recall meets your target.
- Use `halfvec(768)` to roughly halve index size when 16-bit precision is enough.
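As a rough aid for the IVFFlat and `halfvec` settings above, the sketch below computes starting values from a row count. It assumes the rule of thumb given in the pgvector README (rows / 1000 up to one million rows, √rows beyond that, and probes near √lists) and the documented per-value storage of element bytes × dimensions plus 8 bytes. The helper names, the `items` table and the `embedding` column are hypothetical.

```python
import math

def ivfflat_starting_point(rows):
    """Starting values only; tune both against measured recall."""
    lists = max(1, rows // 1000) if rows <= 1_000_000 else round(math.sqrt(rows))
    probes = max(1, round(math.sqrt(lists)))
    return lists, probes

def bytes_per_value(dims, element_bytes):
    """Per-value storage: element payload plus an 8-byte header."""
    return element_bytes * dims + 8

lists, probes = ivfflat_starting_point(4_000_000)
print(lists, probes)  # 2000 45

# Hypothetical table and column names; the statements use pgvector's
# documented index and setting syntax.
print(f"CREATE INDEX ON items USING ivfflat (embedding vector_cosine_ops) WITH (lists = {lists});")
print(f"SET ivfflat.probes = {probes};")

print(bytes_per_value(768, 4))  # vector(768):  3080
print(bytes_per_value(768, 2))  # halfvec(768): 1544
```

These numbers are a place to begin measuring, not a guarantee of any particular recall or latency.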
Key Features
- Works inside transactions — inserts and embeddings commit atomically with other rows.
- Filtered ANN: combine `WHERE tenant_id = ?` with `<=>` using partial indexes or partitioning.
- Parallel index builds and query execution on multi-core machines.
- Exact search with no index for tiny datasets or unit tests.
- Hybrid search with `tsvector` / `ts_rank` in the same query.
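Hybrid search needs some way to merge a vector ranking with a keyword ranking. The text above does not name a method, so the following is only one common option, reciprocal rank fusion, shown as application-side Python to keep the arithmetic visible. The document ids are hypothetical, and `k=60` is a conventional default rather than a pgvector setting.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked id lists; each appearance contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

vector_hits = ["doc3", "doc1", "doc7"]   # hypothetical: ordered by embedding <=> query
keyword_hits = ["doc1", "doc9", "doc3"]  # hypothetical: ordered by ts_rank, best first

print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# ['doc1', 'doc3', 'doc9', 'doc7']
```

Documents that appear in both rankings rise to the top, while those found by only one retriever are kept further down. The same fusion can also be written in SQL over two ranked subqueries if you prefer to stay in a single statement.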
Comparison with Similar Tools
- Qdrant / Weaviate / Milvus — dedicated vector DBs, richer filtering DSL, extra infra to run.
- Pinecone — managed service with its own API and no SQL story.
- Redis VSS — fast in-memory ANN; lacks transactions and SQL joins.
- Elasticsearch kNN — good if you already run ES; weaker transactional semantics.
- SQLite sqlite-vec — nice for edge, but single-writer and no concurrent MVCC.
FAQ
Q: How many vectors can pgvector handle?
A: Hundreds of millions with HNSW on a node with enough memory to keep the index cached; beyond that, shard with Citus or use partitioning.
Q: HNSW or IVFFlat?
A: HNSW for recall and low latency at query time; IVFFlat when index size and build speed matter more.
Q: Do I need to normalise vectors?
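A: Not for cosine: `<=>` with `vector_cosine_ops` divides by the vector norms, so vector length does not affect the result. If your vectors are already unit length, `<#>` with `vector_ip_ops` gives the same ranking and is usually faster.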
Q: Is pgvector supported on managed Postgres?
A: Yes — AWS RDS/Aurora, Google Cloud SQL, Azure, Supabase, Neon and Crunchy Bridge all enable it.