Available Tools
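The example prompts under each tool below are natural-language requests; the agent turns them into MCP tool calls. The following is a minimal sketch of what such a call might look like on the wire, assuming the server follows the standard MCP `tools/call` JSON-RPC shape. The argument names (`namespace`, `query`, `top_k`) are hypothetical, since the tools' input schemas are not given here; the schema the server reports via `tools/list` is authoritative.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments: the real input schema of tpuf_search is an
# assumption here and may use different field names.
search_request = build_tool_call(
    1,
    "tpuf_search",
    {"namespace": "docs", "query": "error handling", "top_k": 5},
)

print(json.dumps(search_request, indent=2))
```

In normal use the MCP client builds this message for you; the sketch only shows how a prompt such as "find the 5 most relevant documents" could map onto a tool name and arguments.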
tpuf_upsert
Store vectors with metadata:
"Store this document about authentication patterns with tags: auth, security"
tpuf_search
Semantic similarity search:
"Find the 5 most relevant documents about error handling"
tpuf_delete
Remove vectors:
"Delete all vectors tagged with 'deprecated'"
tpuf_list_namespaces
View all collections:
"Show all my vector namespaces and their sizes"
Why Turbopuffer
vs Self-Hosted Qdrant/Milvus
- Zero ops: no Docker, no clusters, no backups
- Auto-scaling: handles 10 to 10 million vectors
- Pay per query, not per server hour
vs Pinecone
- Open pricing, no hidden costs
- Sub-10ms P99 latency
- No pod management
Performance
| Metric | Value |
|---|---|
| Search latency (P50) | 3ms |
| Search latency (P99) | 8ms |
| Max vectors | Unlimited |
| Dimensions | Up to 4096 |
| Concurrent queries | Auto-scaled |
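To read the P50 and P99 rows: P50 is the latency that half of all queries stay at or below, and P99 is the latency that 99% of queries stay at or below. The following is a minimal sketch of how you might compute both from your own measurements, using the nearest-rank method. The sample latencies are made up for illustration and are not Turbopuffer benchmark data.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up query latencies in milliseconds, for illustration only.
latencies_ms = [2.1, 2.8, 3.0, 3.2, 2.9, 3.1, 4.0, 2.7, 7.9, 3.3]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"P50 = {p50} ms, P99 = {p99} ms")
```

With a sample this small, P99 is simply the slowest query; a meaningful P99 needs at least a few hundred measurements.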
Key Stats
- 1,200+ GitHub stars
- Sub-10ms search latency
- Serverless: zero infrastructure
- Auto-scaling from 0 to millions
- Free tier available
FAQ
Q: What is Turbopuffer MCP?
A: An MCP server that connects AI agents to Turbopuffer, a serverless vector database, for semantic search and RAG with no infrastructure to manage.
Q: Is Turbopuffer free?
A: A free tier is available with generous limits, and the MCP server itself is open source.
Q: How is Turbopuffer different from Qdrant?
A: Turbopuffer is fully serverless, with no Docker and no cluster management. Qdrant requires self-hosting or a managed plan. Turbopuffer suits teams that want zero ops.