Scripts2026年4月15日·1 分钟阅读

pgvector — Vector Similarity Search Inside PostgreSQL

A PostgreSQL extension that adds a native `vector` type, HNSW and IVFFlat indexes, and distance operators so semantic search, RAG and recommendation workloads can reuse the same database as the rest of the app.

Introduction

pgvector brings dense-vector storage, indexing and distance functions to PostgreSQL, turning your existing SQL database into a first-class vector store. You keep transactions, joins, row-level security and backups while adding semantic search, RAG retrieval or recommendation ranking — no separate vector DB to operate.

What pgvector Does

  • Adds the vector, halfvec, sparsevec and bit types for dense/sparse embeddings.
  • Ships distance operators: <-> L2, <=> cosine, <#> inner product, <+> L1.
  • Builds approximate indexes with HNSW (high recall) or IVFFlat (low memory).
  • Supports exact nearest-neighbour search when you need 100% recall on small sets.
  • Integrates with every PG client — LangChain, LlamaIndex, Django, Rails, Drizzle all speak pgvector.

Architecture Overview

Vectors are stored as fixed-size float4 arrays inside a regular Postgres heap, so MVCC, WAL and streaming replication Just Work. HNSW indexes store a layered proximity graph in shared buffers; IVFFlat builds k-means centroids and stores vectors in posting lists. Queries use Postgres's executor, so you can combine ORDER BY embedding <=> $1 with WHERE filters, joins and pagination in a single plan.

Self-Hosting & Configuration

  • Install via apt, yum, Homebrew, Docker (pgvector/pgvector:pg16), or compile from source.
  • Tune maintenance_work_mem (4–8 GB) before building HNSW indexes on large tables.
  • For HNSW: raise hnsw.ef_search at query time for higher recall; tune m/ef_construction at build time.
  • For IVFFlat: set lists ≈ √rows; keep probes small at first, raise until recall meets target.
  • Use halfvec(768) to cut index size in half when 16-bit precision is enough.

Key Features

  • Works inside transactions — inserts and embeddings commit atomically with other rows.
  • Filtered ANN: combine WHERE tenant_id = ? with <=> using partial or composite indexes.
  • Parallel index builds and query execution on multi-core machines.
  • Exact search with no index for tiny datasets or unit tests.
  • Hybrid search with tsvector / ts_rank in the same query.

Comparison with Similar Tools

  • Qdrant / Weaviate / Milvus — dedicated vector DBs, richer filtering DSL, extra infra to run.
  • Pinecone — managed service with its own API and no SQL story.
  • Redis VSS — fast in-memory ANN; lacks transactions and SQL joins.
  • Elasticsearch kNN — good if you already run ES; weaker transactional semantics.
  • SQLite sqlite-vec — nice for edge, but single-writer and no concurrent MVCC.

FAQ

Q: How many vectors can pgvector handle? A: Hundreds of millions with HNSW on a beefy node; shard with Citus or partitioning beyond that. Q: HNSW or IVFFlat? A: HNSW for recall and low latency at query time; IVFFlat when index size and build speed matter more. Q: Do I need to normalise vectors? A: For cosine yes, or use vector_cosine_ops which normalises internally. Q: Is pgvector supported on managed Postgres? A: Yes — AWS RDS/Aurora, Google Cloud SQL, Azure, Supabase, Neon and Crunchy Bridge all enable it.

Sources

讨论

登录后参与讨论。
还没有评论,来写第一条吧。

相关资产