# Faiss — Efficient Similarity Search and Clustering of Dense Vectors

> Faiss is a library from Meta AI Research for efficient similarity search and clustering of dense vectors, optimized for billion-scale datasets with GPU acceleration.

## Quick Use

```bash
pip install faiss-cpu
python -c "
import faiss, numpy as np
d, n = 128, 10000
xb = np.random.random((n, d)).astype('float32')
index = faiss.IndexFlatL2(d)
index.add(xb)
D, I = index.search(xb[:5], k=4)
print(I)
"
```

## Introduction

Faiss (Facebook AI Similarity Search) is a C++ library with Python bindings for nearest-neighbor search in high-dimensional spaces. Developed by Meta AI Research, it is the backbone of many production retrieval and RAG systems handling billions of vectors.

## What Faiss Does

- Performs exact and approximate nearest-neighbor search on dense float vectors
- Supports L2 distance, inner product, and other metrics
- Offers dozens of index types: flat, IVF, HNSW, PQ, OPQ, and composites
- Scales to billion-vector datasets using sharding, on-disk storage, and GPU acceleration
- Provides k-means and PCA utilities for preprocessing and quantization training

## Architecture Overview

At its core, Faiss operates on C++ index objects that implement `add()`, `search()`, and `reconstruct()`. Composite indexes stack transformations (OPQ rotation, IVF coarse quantizer, PQ sub-quantization) via an index factory string like `"OPQ16,IVF4096,PQ16"`. GPU indexes mirror their CPU counterparts using CUDA kernels for brute-force and IVF search. Python bindings are generated via SWIG, exposing the full C++ API.
## Self-Hosting & Configuration

- CPU-only: `pip install faiss-cpu`; GPU: `pip install faiss-gpu` (requires CUDA)
- No server process; it is an in-process library linked into your application
- Build custom indexes with the index factory: `faiss.index_factory(dim, "IVF1024,PQ32")`
- Train quantizers on a representative sample before adding the full dataset
- Serialize indexes to disk with `faiss.write_index()` and load with `faiss.read_index()`

## Key Features

- Handles billion-scale vector sets with millisecond-scale query latency
- GPU implementation delivers a 5–10x speedup over CPU for brute-force search
- Composable index building blocks let you trade recall for speed and memory
- Mature and battle-tested in production at Meta and many other organizations
- Active maintenance with regular releases and thorough benchmarks

## Comparison with Similar Tools

- **Milvus** — managed vector database with a distributed architecture; Faiss is an embedded library
- **Qdrant** — Rust-based vector DB with filtering; Faiss focuses on raw search speed
- **Annoy (Spotify)** — simpler API, tree-based ANN; Faiss offers more index types and GPU support
- **ScaNN (Google)** — similar scope with quantization-aware search; Faiss has broader adoption
- **pgvector** — PostgreSQL extension for vector search; Faiss is standalone and faster at scale

## FAQ

**Q: When should I use an approximate index instead of IndexFlatL2?**
A: When your dataset exceeds a few hundred thousand vectors and exact search becomes too slow; IVF+PQ can cut latency by 100x with minimal recall loss.

**Q: Can Faiss handle filtering (metadata predicates) during search?**
A: Faiss supports an `IDSelector` mechanism, but for complex filtering most teams pair it with a metadata store or use a vector database built on Faiss.

**Q: Is Faiss suitable for text embeddings and RAG?**
A: Yes. Many RAG pipelines use Faiss as the vector index behind LangChain, LlamaIndex, and similar orchestration frameworks.
**Q: What is the index factory string?**
A: A compact description like `"IVF4096,PQ32"` that Faiss parses to build a composite index automatically.

## Sources

- https://github.com/facebookresearch/faiss
- https://faiss.ai/

---

Source: https://tokrepo.com/en/workflows/ce076c2f-3e25-11f1-9bc6-00163e2b0d79
Author: Script Depot