Scripts · Apr 15, 2026 · 3 min read

pgvector — Vector Similarity Search Inside PostgreSQL

A PostgreSQL extension that adds a native `vector` type, HNSW and IVFFlat indexes, and distance operators so semantic search, RAG and recommendation workloads can reuse the same database as the rest of the app.

Introduction

pgvector brings dense-vector storage, indexing and distance functions to PostgreSQL, turning your existing SQL database into a first-class vector store. You keep transactions, joins, row-level security and backups while adding semantic search, RAG retrieval or recommendation ranking — no separate vector DB to operate.

What pgvector Does

  • Adds the vector, halfvec, sparsevec and bit types for dense/sparse embeddings.
  • Ships distance operators: <-> (L2/Euclidean), <=> (cosine distance), <#> (negative inner product, so smaller is more similar), <+> (L1/taxicab).
  • Builds approximate indexes with HNSW (high recall) or IVFFlat (low memory).
  • Supports exact nearest-neighbour search when you need 100% recall on small sets.
  • Integrates with every PG client — LangChain, LlamaIndex, Django, Rails, Drizzle all speak pgvector.
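
The basics fit in a few statements. A minimal sketch — the `items` table, its columns, and the 3-dimensional embeddings are illustrative (real embeddings are typically 384–1536 dimensions):

```sql
-- Enable the extension and create a table with an embedding column.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
    id        bigserial PRIMARY KEY,
    content   text,
    embedding vector(3)
);

INSERT INTO items (content, embedding) VALUES
    ('first doc',  '[0.1, 0.2, 0.3]'),
    ('second doc', '[0.2, 0.1, 0.9]');

-- Five nearest neighbours by cosine distance.
SELECT id, content
FROM items
ORDER BY embedding <=> '[0.15, 0.2, 0.4]'
LIMIT 5;
```

With no index this runs as an exact scan; adding an HNSW or IVFFlat index turns the same query into an approximate search.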

Architecture Overview

Vectors are stored as fixed-size float4 arrays inside a regular Postgres heap, so MVCC, WAL and streaming replication Just Work. HNSW indexes store a layered proximity graph in shared buffers; IVFFlat builds k-means centroids and stores vectors in posting lists. Queries use Postgres's executor, so you can combine ORDER BY embedding <=> $1 with WHERE filters, joins and pagination in a single plan.
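
Because the query goes through the normal executor, ANN ordering composes with ordinary SQL. A sketch against a hypothetical multi-tenant schema (`documents`, `users`, and the column names are assumptions for illustration):

```sql
-- ANN ordering, a relational filter, a join and pagination
-- all resolved in a single Postgres plan.
SELECT d.id, d.title, u.name AS author
FROM documents d
JOIN users u ON u.id = d.author_id
WHERE d.tenant_id = 42
ORDER BY d.embedding <=> $1   -- $1 is the query embedding
LIMIT 10 OFFSET 20;
```

Running the same statement under EXPLAIN shows whether the planner chose the vector index or fell back to an exact scan after filtering.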

Self-Hosting & Configuration

  • Install via apt, yum, Homebrew, Docker (pgvector/pgvector:pg16), or compile from source.
  • Tune maintenance_work_mem (4–8 GB) before building HNSW indexes on large tables.
  • For HNSW: raise hnsw.ef_search at query time for higher recall; tune m/ef_construction at build time.
  • For IVFFlat: set lists ≈ rows/1000 (up to ~1M rows) or √rows above that; keep probes small at first, raise until recall meets target.
  • Use halfvec(768) to cut index size in half when 16-bit precision is enough.
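
The knobs above map to a handful of SQL settings. A sketch, assuming the `items` table from earlier — the specific values (8GB, m = 16, ef_construction = 64, ef_search = 100) are reasonable starting points, not prescriptions:

```sql
-- Build-time, session-local: more memory and parallel workers
-- make large HNSW builds substantially faster.
SET maintenance_work_mem = '8GB';
SET max_parallel_maintenance_workers = 7;

-- HNSW index on cosine distance; m and ef_construction trade
-- build time and index size against recall.
CREATE INDEX ON items
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- Query-time: raise ef_search for higher recall at some latency cost.
SET hnsw.ef_search = 100;
```

For IVFFlat the equivalent pair is `WITH (lists = ...)` at build time and `SET ivfflat.probes = ...` at query time.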

Key Features

  • Works inside transactions — inserts and embeddings commit atomically with other rows.
  • Filtered ANN: combine WHERE tenant_id = ? with <=> using partial or composite indexes.
  • Parallel index builds and query execution on multi-core machines.
  • Exact search with no index for tiny datasets or unit tests.
  • Hybrid search with tsvector / ts_rank in the same query.
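
Hybrid search is just a query that mixes both signals. A sketch reusing the illustrative `items` table; the 0.5/0.5 weights and the search terms are assumptions, and production setups would typically precompute the tsvector in an indexed column:

```sql
-- Blend full-text rank with vector similarity in one query.
SELECT id, content,
       0.5 * ts_rank(to_tsvector('english', content),
                     plainto_tsquery('english', 'postgres index'))
     + 0.5 * (1 - (embedding <=> $1)) AS score  -- $1: query embedding
FROM items
WHERE to_tsvector('english', content)
      @@ plainto_tsquery('english', 'postgres index')
ORDER BY score DESC
LIMIT 10;
```

Another common pattern is reciprocal rank fusion: run the lexical and vector queries separately and merge the two ranked lists in application code or a CTE.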

Comparison with Similar Tools

  • Qdrant / Weaviate / Milvus — dedicated vector DBs, richer filtering DSL, extra infra to run.
  • Pinecone — managed service with its own API and no SQL story.
  • Redis VSS — fast in-memory ANN; lacks transactions and SQL joins.
  • Elasticsearch kNN — good if you already run ES; weaker transactional semantics.
  • SQLite sqlite-vec — nice for edge, but single-writer and no concurrent MVCC.

FAQ

Q: How many vectors can pgvector handle?
A: Hundreds of millions with HNSW on a beefy node; shard with Citus or partition beyond that.

Q: HNSW or IVFFlat?
A: HNSW for recall and low latency at query time; IVFFlat when index size and build speed matter more.

Q: Do I need to normalise vectors?
A: Not for cosine distance (<=>) — it is scale-invariant by definition. If your embeddings are already unit-length, inner product (<#>) gives the same ranking and is slightly cheaper.

Q: Is pgvector supported on managed Postgres?
A: Yes — AWS RDS/Aurora, Google Cloud SQL, Azure, Supabase, Neon and Crunchy Bridge all enable it.
