Scripts · Apr 15, 2026 · 3 min read

pgvector — Vector Similarity Search Inside PostgreSQL

A PostgreSQL extension that adds a native `vector` type, HNSW and IVFFlat indexes, and distance operators so semantic search, RAG and recommendation workloads can reuse the same database as the rest of the app.

Introduction

pgvector brings dense-vector storage, indexing and distance functions to PostgreSQL, turning your existing SQL database into a first-class vector store. You keep transactions, joins, row-level security and backups while adding semantic search, RAG retrieval or recommendation ranking — no separate vector DB to operate.

What pgvector Does

  • Adds the vector, halfvec, sparsevec and bit types for dense/sparse embeddings.
  • Ships distance operators: <-> (L2/Euclidean), <=> (cosine distance), <#> (negative inner product, so smaller is more similar), <+> (L1/taxicab).
  • Builds approximate indexes with HNSW (high recall) or IVFFlat (low memory).
  • Supports exact nearest-neighbour search when you need 100% recall on small sets.
  • Integrates with every PG client — LangChain, LlamaIndex, Django, Rails, Drizzle all speak pgvector.
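
The basics fit in a few statements. A minimal sketch — the `items` table, its columns, and the 3-dimensional embeddings are illustrative (real embeddings are typically 384–1536 dimensions):

```sql
-- Enable the extension and create a table with an embedding column.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
    id        bigserial PRIMARY KEY,
    content   text,
    embedding vector(3)
);

INSERT INTO items (content, embedding) VALUES
    ('first doc',  '[0.1, 0.2, 0.3]'),
    ('second doc', '[0.2, 0.1, 0.9]');

-- Five nearest neighbours by cosine distance.
SELECT id, content
FROM items
ORDER BY embedding <=> '[0.15, 0.2, 0.4]'
LIMIT 5;
```

With no index this runs as an exact scan; adding an HNSW or IVFFlat index turns the same query into an approximate search.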

Architecture Overview

Vectors are stored as fixed-size float4 arrays inside a regular Postgres heap, so MVCC, WAL and streaming replication Just Work. HNSW indexes store a layered proximity graph in shared buffers; IVFFlat builds k-means centroids and stores vectors in posting lists. Queries use Postgres's executor, so you can combine ORDER BY embedding <=> $1 with WHERE filters, joins and pagination in a single plan.
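
Because the query goes through the normal executor, ANN ordering composes with ordinary SQL. A sketch against a hypothetical multi-tenant schema (`documents`, `users`, and the column names are assumptions for illustration):

```sql
-- ANN ordering, a relational filter, a join and pagination
-- all resolved in a single Postgres plan.
SELECT d.id, d.title, u.name AS author
FROM documents d
JOIN users u ON u.id = d.author_id
WHERE d.tenant_id = 42
ORDER BY d.embedding <=> $1   -- $1 is the query embedding
LIMIT 10 OFFSET 20;
```

Running the same statement under EXPLAIN shows whether the planner chose the vector index or fell back to an exact scan after filtering.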

Self-Hosting & Configuration

  • Install via apt, yum, Homebrew, Docker (pgvector/pgvector:pg16), or compile from source.
  • Tune maintenance_work_mem (4–8 GB) before building HNSW indexes on large tables.
  • For HNSW: raise hnsw.ef_search at query time for higher recall; tune m/ef_construction at build time.
  • For IVFFlat: set lists ≈ rows/1000 (up to ~1M rows) or √rows above that; keep probes small at first, raise until recall meets target.
  • Use halfvec(768) to cut index size in half when 16-bit precision is enough.
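
The knobs above map to a handful of SQL settings. A sketch, assuming the `items` table from earlier — the specific values (8GB, m = 16, ef_construction = 64, ef_search = 100) are reasonable starting points, not prescriptions:

```sql
-- Build-time, session-local: more memory and parallel workers
-- make large HNSW builds substantially faster.
SET maintenance_work_mem = '8GB';
SET max_parallel_maintenance_workers = 7;

-- HNSW index on cosine distance; m and ef_construction trade
-- build time and index size against recall.
CREATE INDEX ON items
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- Query-time: raise ef_search for higher recall at some latency cost.
SET hnsw.ef_search = 100;
```

For IVFFlat the equivalent pair is `WITH (lists = ...)` at build time and `SET ivfflat.probes = ...` at query time.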

Key Features

  • Works inside transactions — inserts and embeddings commit atomically with other rows.
  • Filtered ANN: combine WHERE tenant_id = ? with <=> using partial or composite indexes.
  • Parallel index builds and query execution on multi-core machines.
  • Exact search with no index for tiny datasets or unit tests.
  • Hybrid search with tsvector / ts_rank in the same query.
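
Hybrid search is just a query that mixes both signals. A sketch reusing the illustrative `items` table; the 0.5/0.5 weights and the search terms are assumptions, and production setups would typically precompute the tsvector in an indexed column:

```sql
-- Blend full-text rank with vector similarity in one query.
SELECT id, content,
       0.5 * ts_rank(to_tsvector('english', content),
                     plainto_tsquery('english', 'postgres index'))
     + 0.5 * (1 - (embedding <=> $1)) AS score  -- $1: query embedding
FROM items
WHERE to_tsvector('english', content)
      @@ plainto_tsquery('english', 'postgres index')
ORDER BY score DESC
LIMIT 10;
```

Another common pattern is reciprocal rank fusion: run the lexical and vector queries separately and merge the two ranked lists in application code or a CTE.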

Comparison with Similar Tools

  • Qdrant / Weaviate / Milvus — dedicated vector DBs, richer filtering DSL, extra infra to run.
  • Pinecone — managed service with its own API and no SQL story.
  • Redis VSS — fast in-memory ANN; lacks transactions and SQL joins.
  • Elasticsearch kNN — good if you already run ES; weaker transactional semantics.
  • SQLite sqlite-vec — nice for edge, but single-writer and no concurrent MVCC.

FAQ

Q: How many vectors can pgvector handle?
A: Hundreds of millions with HNSW on a beefy node; shard with Citus or partition beyond that.

Q: HNSW or IVFFlat?
A: HNSW for recall and low latency at query time; IVFFlat when index size and build speed matter more.

Q: Do I need to normalise vectors?
A: Not for cosine distance (<=>) — it is scale-invariant by definition. If your embeddings are already unit-length, inner product (<#>) gives the same ranking and is slightly cheaper.

Q: Is pgvector supported on managed Postgres?
A: Yes — AWS RDS/Aurora, Google Cloud SQL, Azure, Supabase, Neon and Crunchy Bridge all enable it.
