Workflows · Apr 8, 2026 · 2 min read

Turbopuffer — Serverless Vector DB for AI Search

Serverless vector database built for AI search at scale. Turbopuffer offers sub-millisecond queries, automatic scaling, and pay-per-query pricing with zero infrastructure.

AI · Open Source · Community
Quick Use

Use it first, then decide how deep to go

Install the client, connect with your API key, and run your first query:

pip install turbopuffer
import turbopuffer as tpuf

# Connect (serverless — no infrastructure to manage)
tpuf.api_key = "tbp_..."

# Create namespace and upsert vectors
ns = tpuf.Namespace("my-docs")
ns.upsert(
    ids=[1, 2, 3],
    vectors=[[0.1, 0.2, ...], [0.3, 0.4, ...], [0.5, 0.6, ...]],
    attributes={"title": ["Doc A", "Doc B", "Doc C"]},
)

# Query
results = ns.query(
    vector=[0.15, 0.25, ...],
    top_k=10,
    include_attributes=["title"],
)
for r in results:
    print(f"{r.id}: {r.attributes['title']} (score: {r.dist:.4f})")

What is Turbopuffer?

Turbopuffer is a serverless vector database designed for AI search workloads. It stores embeddings and serves similarity queries with sub-millisecond latency at any scale. Unlike self-hosted vector databases, Turbopuffer requires zero infrastructure — just an API key. Pay only for what you query, with automatic scaling from zero to billions of vectors.

Answer-Ready: Turbopuffer is a serverless vector database for AI search. Sub-millisecond queries, automatic scaling, pay-per-query pricing. No infrastructure to manage. Supports filtering, hybrid search, and namespaces. Used by AI companies for production RAG. Backed by a16z.

Best for: AI teams building RAG or semantic search without managing infrastructure. Works with: OpenAI embeddings, Cohere, any embedding model. Setup time: Under 1 minute.
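At its core, a similarity query ranks stored embeddings by distance to the query vector. A minimal pure-Python sketch of cosine distance (the metric most embedding models assume; the hosted index does this at scale, not this loop):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance = 1 - cosine similarity; 0.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Rank a few stored vectors against a query, nearest first
docs = {1: [0.1, 0.2], 2: [0.9, 0.1], 3: [0.2, 0.1]}
query = [0.15, 0.25]
ranked = sorted(docs, key=lambda i: cosine_distance(docs[i], query))
print(ranked)  # → [1, 3, 2]
```

A top_k query is exactly this ranking, truncated to the k nearest IDs.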

Core Features

1. Serverless (Zero Ops)

No clusters, no replicas, no shards. Create a namespace and start querying:

ns = tpuf.Namespace("products")
ns.upsert(ids=[1], vectors=[[...]], attributes={"name": ["Widget"]})
# That's it. No provisioning.

2. Attribute Filtering

results = ns.query(
    vector=[...],
    top_k=10,
    filters={"category": ["electronics"], "price": {"$lt": 100}},
)
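The filter operators behave like ordinary comparisons. A toy local evaluator (illustrative only, not the service's implementation) shows the semantics of list-valued equality and `$lt`:

```python
def matches(attrs: dict, filters: dict) -> bool:
    """Toy evaluator for the filter shapes shown above."""
    for key, cond in filters.items():
        value = attrs.get(key)
        if isinstance(cond, list):      # list = "value is one of these"
            if value not in cond:
                return False
        elif isinstance(cond, dict):    # comparison operators
            if "$lt" in cond and not (value is not None and value < cond["$lt"]):
                return False
            if "$gt" in cond and not (value is not None and value > cond["$gt"]):
                return False
            if "$in" in cond and value not in cond["$in"]:
                return False
        elif value != cond:
            return False
    return True

item = {"category": "electronics", "price": 79}
print(matches(item, {"category": ["electronics"], "price": {"$lt": 100}}))  # → True
```

In the real service the filter is applied server-side during the index scan, so only matching rows count toward top_k.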

3. Hybrid Search

# Combine vector similarity with BM25 text search
results = ns.query(
    vector=[...],
    top_k=10,
    rank_by=["vector_distance", "bm25"],
)
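One standard way to combine two rankings (vector distance and BM25) is reciprocal rank fusion. This sketch assumes you already have the two ranked ID lists; it is a client-side illustration of the idea, not Turbopuffer's server-side ranking:

```python
def rrf(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank)."""
    scores: dict[int, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = [1, 2, 3]  # nearest-first by vector distance
bm25_hits = [3, 1, 4]    # best-first by BM25
print(rrf([vector_hits, bm25_hits]))  # → [1, 3, 2, 4]
```

Documents that appear high in both lists (like 1 and 3 here) float to the top, which is why hybrid search often beats either signal alone.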

4. Performance

| Metric | Value |
| --- | --- |
| Query latency (p50) | <1 ms |
| Query latency (p99) | <10 ms |
| Max vectors | Billions |
| Dimensions | Up to 4096 |
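To verify numbers like these against your own workload, time repeated queries and compute p50/p99 yourself. A sketch using a stand-in function where a real `ns.query(...)` call would go:

```python
import random
import statistics
import time

def measure_percentiles(fn, n: int = 1000) -> tuple[float, float]:
    """Run fn() n times and return (p50, p99) latency in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    quantiles = statistics.quantiles(samples, n=100)  # 99 cut points
    return quantiles[49], quantiles[98]  # p50, p99

# Stand-in workload; replace the lambda with a real query in practice
p50, p99 = measure_percentiles(lambda: sum(random.random() for _ in range(100)))
print(f"p50={p50:.3f}ms p99={p99:.3f}ms")
```

Measure from the same region as your production traffic; network round-trip usually dominates single-digit-millisecond query times.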

Turbopuffer vs Alternatives

| Feature | Turbopuffer | Pinecone | Qdrant | Weaviate |
| --- | --- | --- | --- | --- |
| Serverless | Yes | Yes (paid) | No | No |
| Pricing | Per query | Per pod/hour | Free (OSS) | Free (OSS) |
| Scale to zero | Yes | No | N/A | N/A |
| Self-hosted | No | No | Yes | Yes |
| Latency | <1ms | ~10ms | ~5ms | ~5ms |

FAQ

Q: How does pricing work? A: Pay per query and storage. No minimum spend. Scales to zero when not in use — ideal for variable workloads.

Q: Can I migrate from Pinecone? A: Yes, export vectors from Pinecone and upsert into Turbopuffer. The API is similar.
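Migrating in batches keeps individual requests small. A sketch of the chunking loop, with the export and upsert calls stubbed out since exact client methods vary by version:

```python
from typing import Iterator

def chunked(items: list, size: int) -> Iterator[list]:
    """Yield successive size-sized batches of items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Pretend these rows came from a Pinecone export: (id, vector, metadata)
exported = [(i, [0.0, 0.0], {"title": f"Doc {i}"}) for i in range(250)]

batches = list(chunked(exported, 100))
print([len(b) for b in batches])  # → [100, 100, 50]
# For each batch, unzip and upsert into the target namespace:
# for batch in batches:
#     ids, vectors, metas = zip(*batch)
#     ns.upsert(ids=list(ids), vectors=list(vectors),
#               attributes={"title": [m["title"] for m in metas]})
```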

Q: Does it support metadata filtering? A: Yes, filter on any attribute with comparison operators ($eq, $lt, $gt, $in, etc.).


Source & Thanks

Created by Turbopuffer. Backed by a16z.

turbopuffer.com — Serverless vector database
