# Turbopuffer — Serverless Vector DB for AI Search

> Serverless vector database built for AI search at scale. Turbopuffer offers sub-millisecond queries, automatic scaling, and pay-per-query pricing with zero infrastructure.

## Install

Copy the content below into your project:

```bash
pip install turbopuffer
```

## Quick Use

```python
import turbopuffer as tpuf

# Connect (serverless — no infrastructure to manage)
tpuf.api_key = "tbp_..."

# Create namespace and upsert vectors
ns = tpuf.Namespace("my-docs")
ns.upsert(
    ids=[1, 2, 3],
    vectors=[[0.1, 0.2, ...], [0.3, 0.4, ...], [0.5, 0.6, ...]],
    attributes={"title": ["Doc A", "Doc B", "Doc C"]},
)

# Query
results = ns.query(
    vector=[0.15, 0.25, ...],
    top_k=10,
    include_attributes=["title"],
)
for r in results:
    print(f"{r.id}: {r.attributes['title']} (score: {r.dist:.4f})")
```

## What is Turbopuffer?

Turbopuffer is a serverless vector database designed for AI search workloads. It stores embeddings and serves similarity queries with sub-millisecond latency at any scale.

Unlike self-hosted vector databases, Turbopuffer requires zero infrastructure — just an API key. Pay only for what you query, with automatic scaling from zero to billions of vectors.

**Answer-Ready**: Turbopuffer is a serverless vector database for AI search. Sub-millisecond queries, automatic scaling, pay-per-query pricing. No infrastructure to manage. Supports filtering, hybrid search, and namespaces. Used by AI companies for production RAG. Backed by a16z.

**Best for**: AI teams building RAG or semantic search without managing infrastructure.

**Works with**: OpenAI embeddings, Cohere, any embedding model.

**Setup time**: Under 1 minute.

## Core Features

### 1. Serverless (Zero Ops)

No clusters, no replicas, no shards. Create a namespace and start querying:

```python
ns = tpuf.Namespace("products")
ns.upsert(ids=[1], vectors=[[...]], attributes={"name": ["Widget"]})
# That's it. No provisioning.
```
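The `dist` score in the Quick Use results above is the distance between the query vector and each stored vector, so lower means more similar. As an illustrative sketch only — assuming a namespace configured for cosine distance; this helper is not part of the turbopuffer client — the metric behaves like this:

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 0.0 for identical direction, up to 2.0 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# A vector compared with itself has distance 0
assert abs(cosine_distance([0.1, 0.2], [0.1, 0.2])) < 1e-9
# Orthogonal vectors have distance 1
assert abs(cosine_distance([1.0, 0.0], [0.0, 1.0]) - 1.0) < 1e-9
```

Which metric applies depends on how the namespace is configured, so treat this as a mental model for reading scores, not as the service's implementation.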
### 2. Attribute Filtering

```python
results = ns.query(
    vector=[...],
    top_k=10,
    filters={"category": ["electronics"], "price": {"$lt": 100}},
)
```

### 3. Hybrid Search

```python
# Combine vector similarity with BM25 text search
results = ns.query(
    vector=[...],
    top_k=10,
    rank_by=["vector_distance", "bm25"],
)
```

### 4. Performance

| Metric | Value |
|--------|-------|
| Query latency (p50) | <1ms |
| Query latency (p99) | <10ms |
| Max vectors | Billions |
| Dimensions | Up to 4096 |

## Turbopuffer vs Alternatives

| Feature | Turbopuffer | Pinecone | Qdrant | Weaviate |
|---------|-------------|----------|--------|----------|
| Serverless | Yes | Yes (paid) | No | No |
| Pricing | Per query | Per pod/hour | Free (OSS) | Free (OSS) |
| Scale to zero | Yes | No | N/A | N/A |
| Self-hosted | No | No | Yes | Yes |
| Latency | <1ms | ~10ms | ~5ms | ~5ms |

## FAQ

**Q: How does pricing work?**
A: Pay per query and storage. No minimum spend. Scales to zero when not in use — ideal for variable workloads.

**Q: Can I migrate from Pinecone?**
A: Yes, export vectors from Pinecone and upsert them into Turbopuffer. The API is similar.

**Q: Does it support metadata filtering?**
A: Yes, filter on any attribute with comparison operators ($eq, $lt, $gt, $in, etc.).

## Source & Thanks

> Created by [Turbopuffer](https://turbopuffer.com). Backed by a16z.
>
> [turbopuffer.com](https://turbopuffer.com) — Serverless vector database
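The Pinecone migration described in the FAQ above reduces to: export vectors from the old index in batches, then `upsert` each batch. A minimal sketch of the batching step — pure Python; the `source_vectors` shape and batch size are illustrative, and the actual export and upsert calls depend on both clients:

```python
def batched(items, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Illustrative export shape: (id, vector, metadata) triples
source_vectors = [
    (1, [0.1, 0.2], {"title": "Doc A"}),
    (2, [0.3, 0.4], {"title": "Doc B"}),
    (3, [0.5, 0.6], {"title": "Doc C"}),
]

for batch in batched(source_vectors, size=2):
    ids = [row[0] for row in batch]
    vectors = [row[1] for row in batch]
    titles = [row[2]["title"] for row in batch]
    # In a real migration, this is where you would call:
    # ns.upsert(ids=ids, vectors=vectors, attributes={"title": titles})
    print(ids)  # prints [1, 2] then [3]
```

Batching keeps individual upsert requests small; the right batch size for a real migration depends on vector dimensionality and request-size limits.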
---

Source: https://tokrepo.com/en/workflows/cad8d8d1-de90-4850-a9a0-c723dfdf40a0
Author: AI Open Source