# Faiss — Efficient Similarity Search and Clustering of Dense Vectors

> Faiss is a library from Meta AI Research for efficient similarity search and clustering of dense vectors, optimized for billion-scale datasets with GPU acceleration.

## Quick Use

```bash
pip install faiss-cpu
python -c "
import faiss, numpy as np
d, n = 128, 10000
xb = np.random.random((n, d)).astype('float32')
index = faiss.IndexFlatL2(d)
index.add(xb)
D, I = index.search(xb[:5], k=4)
print(I)
"
```

## Introduction

Faiss (Facebook AI Similarity Search) is a C++ library with Python bindings for nearest-neighbor search in high-dimensional spaces. Developed by Meta AI Research, it is the backbone of many production retrieval and RAG systems handling billions of vectors.

## What Faiss Does

- Performs exact and approximate nearest-neighbor search on dense float vectors
- Supports L2 distance, inner product, and other metrics
- Offers dozens of index types: flat, IVF, HNSW, PQ, OPQ, and composites
- Scales to billion-vector datasets using sharding, on-disk storage, and GPU acceleration
- Provides k-means and PCA utilities for preprocessing and quantization training

## Architecture Overview

At its core, Faiss operates on C++ index objects that implement `add()`, `search()`, and `reconstruct()`. Composite indexes stack transformations (OPQ rotation, IVF coarse quantizer, PQ sub-quantization) via an index factory string like `"OPQ16,IVF4096,PQ16"`. GPU indexes mirror their CPU counterparts using CUDA kernels for brute-force and IVF search. Python bindings are generated via SWIG, exposing the full C++ API.
## Self-Hosting & Configuration

- CPU-only: `pip install faiss-cpu`; GPU: `pip install faiss-gpu` (requires CUDA)
- No server process; it is an in-process library linked into your application
- Build custom indexes with the index factory: `faiss.index_factory(dim, "IVF1024,PQ32")`
- Train quantizers on a representative sample before adding the full dataset
- Serialize indexes to disk with `faiss.write_index()` and load with `faiss.read_index()`

## Key Features

- Handles billion-scale vector sets with millisecond-scale query latency
- GPU implementation delivers a 5–10x speedup over CPU for brute-force search
- Composable index building blocks let you trade recall for speed and memory
- Mature and battle-tested in production at Meta and many other organizations
- Active maintenance with regular releases and thorough benchmarks

## Comparison with Similar Tools

- **Milvus** — managed vector database with a distributed architecture; Faiss is an embedded library
- **Qdrant** — Rust-based vector DB with filtering; Faiss focuses on raw search speed
- **Annoy (Spotify)** — simpler API, tree-based ANN; Faiss offers more index types and GPU support
- **ScaNN (Google)** — similar scope with quantization-aware search; Faiss has broader adoption
- **pgvector** — PostgreSQL extension for vector search; Faiss is standalone and faster at scale

## FAQ

**Q: When should I use an approximate index instead of IndexFlatL2?**
A: When your dataset exceeds a few hundred thousand vectors and exact search becomes too slow; IVF+PQ can cut latency by 100x with minimal recall loss.

**Q: Can Faiss handle filtering (metadata predicates) during search?**
A: Faiss supports an `IDSelector` mechanism, but for complex filtering most teams pair it with a metadata store or use a vector database built on Faiss.

**Q: Is Faiss suitable for text embeddings and RAG?**
A: Yes. Many RAG pipelines use Faiss as the vector index behind LangChain, LlamaIndex, and similar orchestration frameworks.
**Q: What is the index factory string?**
A: A compact description like `"IVF4096,PQ32"` that Faiss parses to build a composite index automatically.

## Sources

- https://github.com/facebookresearch/faiss
- https://faiss.ai/

---

Source: https://tokrepo.com/en/workflows/ce076c2f-3e25-11f1-9bc6-00163e2b0d79
Author: Script Depot