MCP Configs · Apr 6, 2026 · 2 min read

Turbopuffer MCP — Serverless Vector DB for AI Agents

MCP server for Turbopuffer serverless vector database. Sub-10ms search, zero ops, auto-scaling. Perfect for AI agent memory and RAG without managing infrastructure. 1,200+ stars.

MCP Hub · Community
Quick Use

Add to your .mcp.json:

{
  "mcpServers": {
    "turbopuffer": {
      "command": "npx",
      "args": ["-y", "@turbopuffer/mcp-server"],
      "env": {
        "TURBOPUFFER_API_KEY": "your-api-key"
      }
    }
  }
}

Get a free API key at turbopuffer.com and put it in place of your-api-key, then restart Claude Code so it loads the new server.
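If your .mcp.json already lists other servers, pasting the JSON above over the file would drop them. Here is a minimal Python sketch that merges the entry in place instead. It assumes the file uses the top-level mcpServers key shown above; the function name add_turbopuffer_server is illustrative and not part of any Turbopuffer tooling.

```python
import copy
import json
from pathlib import Path

# The server entry from the config above; the API key is a placeholder.
TURBOPUFFER_ENTRY = {
    "command": "npx",
    "args": ["-y", "@turbopuffer/mcp-server"],
    "env": {"TURBOPUFFER_API_KEY": "your-api-key"},
}


def add_turbopuffer_server(config_path: Path, api_key: str) -> dict:
    """Add the turbopuffer entry to an MCP config file, keeping other servers.

    Hypothetical helper for illustration: creates the file if it is missing,
    and overwrites only the "turbopuffer" key under "mcpServers".
    """
    if config_path.exists():
        config = json.loads(config_path.read_text())
    else:
        config = {}

    servers = config.setdefault("mcpServers", {})

    # Copy the template so the module-level placeholder is never mutated.
    entry = copy.deepcopy(TURBOPUFFER_ENTRY)
    entry["env"]["TURBOPUFFER_API_KEY"] = api_key
    servers["turbopuffer"] = entry

    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

A typical call would be add_turbopuffer_server(Path(".mcp.json"), os.environ["TURBOPUFFER_API_KEY"]), which keeps the key out of your shell history and out of the script itself.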


Intro

Turbopuffer MCP is a Model Context Protocol server for the Turbopuffer serverless vector database, with 1,200+ GitHub stars. It gives AI agents such as Claude Code access to sub-10ms vector search with no infrastructure to manage: no database to deploy, no clusters to scale, no indexes to tune. Agents can store embeddings, run semantic searches, and build RAG pipelines from natural-language commands. It suits developers who want production vector search without DevOps overhead.

Works with: Claude Code, Cursor, and any other MCP client. Setup time: under 1 minute.


Available Tools

tpuf_upsert

Store vectors with metadata:

"Store this document about authentication patterns with tags: auth, security"

tpuf_search

Semantic similarity search:

"Find the 5 most relevant documents about error handling"
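To make "semantic similarity search" concrete, here is a small Python sketch that ranks stored vectors by cosine similarity to a query vector and returns the top k. It is a brute-force illustration of the idea only. The source does not describe how Turbopuffer indexes or scores vectors, and the function names and toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, docs, k=5):
    """Return the k (doc_id, score) pairs most similar to the query, best first.

    docs maps a document id to its embedding vector. Every stored vector is
    scanned, which is fine for a demo but not how a production index works.
    """
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional "embeddings"; real embeddings have hundreds of dimensions.
docs = {"retry-guide": [1.0, 0.0], "style-guide": [0.0, 1.0], "error-faq": [1.0, 1.0]}
results = top_k([1.0, 0.0], docs, k=2)
```

In this toy data the query points the same way as "retry-guide", so that document ranks first and "error-faq", which shares one component with the query, ranks second.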

tpuf_delete

Remove vectors:

"Delete all vectors tagged with 'deprecated'"

tpuf_list_namespaces

View all collections:

"Show all my vector namespaces and their sizes"
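Behind each natural-language prompt, the MCP client sends the server a JSON-RPC 2.0 request with the method tools/call, naming the tool and passing its arguments. The Python sketch below builds such a request for tpuf_search. The envelope (jsonrpc, id, method, params.name, params.arguments) follows the MCP specification, but the argument names namespace, query, and top_k are assumptions, since the source does not list the tools' input schemas. A client can read the real schemas with the tools/list method.

```python
import json
from itertools import count

# JSON-RPC requests need an id, unique per connection, to match responses.
_request_ids = count(1)


def build_tool_call(tool_name, arguments):
    """Build an MCP tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names here are hypothetical; check the server's tools/list output.
request = build_tool_call(
    "tpuf_search",
    {"namespace": "docs", "query": "error handling", "top_k": 5},
)
wire_message = json.dumps(request)
```

You rarely write these messages by hand, because Claude Code and Cursor generate them for you. Knowing the shape still helps when reading server logs or debugging a tool call that fails.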

Why Turbopuffer

vs Self-Hosted Qdrant/Milvus

  • Zero ops: no Docker, no clusters, no backups
  • Auto-scaling: handles 10 to 10 million vectors
  • Pay per query, not per server hour

vs Pinecone

  • Open pricing, no hidden costs
  • Sub-10ms P99 latency
  • No pod management

Performance

Metric                  Value
Search latency (P50)    3 ms
Search latency (P99)    8 ms
Max vectors             Unlimited
Dimensions              Up to 4096
Concurrent queries      Auto-scaled
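P50 and P99 are percentiles: half of all searches finish within the P50 time, and 99 out of 100 finish within the P99 time. To check figures like these against your own workload, you can time a batch of queries and compute the percentiles yourself. Below is a minimal Python sketch using the nearest-rank method. The sample latencies are made up for illustration and are not Turbopuffer measurements.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical per-query latencies in milliseconds, e.g. collected with time.perf_counter().
latencies_ms = [3, 1, 2, 8, 4]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
```

With only five samples the P99 is simply the slowest query. A meaningful P99 needs at least a few hundred measurements, taken from the same network location your agent will run in.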

Key Stats

  • 1,200+ GitHub stars
  • Sub-10ms search latency
  • Serverless — zero infrastructure
  • Auto-scaling from 0 to millions
  • Free tier available

FAQ

Q: What is Turbopuffer MCP?
A: An MCP server that connects AI agents to Turbopuffer, a serverless vector database, for semantic search and RAG with zero infrastructure management.

Q: Is Turbopuffer free?
A: A free tier is available with generous limits. The MCP server itself is open-source.

Q: How is Turbopuffer different from Qdrant?
A: Turbopuffer is fully serverless, with no Docker and no cluster management. Qdrant requires self-hosting or a managed plan. Turbopuffer is ideal for teams that want zero ops.


Source & Thanks

Created by Turbopuffer. Licensed under MIT.

turbopuffer — ⭐ 1,200+

Thanks for making vector search as easy as a function call.
