# Turbopuffer MCP — Serverless Vector DB for AI Agents

> MCP server for Turbopuffer serverless vector database. Sub-10ms search, zero ops, auto-scaling. Perfect for AI agent memory and RAG without managing infrastructure. 1,200+ stars.

## Install

Merge the JSON below into your `.mcp.json`:

```json
{
  "mcpServers": {
    "turbopuffer": {
      "command": "npx",
      "args": ["-y", "@turbopuffer/mcp-server"],
      "env": {
        "TURBOPUFFER_API_KEY": "your-api-key"
      }
    }
  }
}
```

Get a free API key at [turbopuffer.com](https://turbopuffer.com) and paste it in place of `your-api-key`. Then restart Claude Code so it loads the new server.
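
If you already have a `.mcp.json` with other servers, the entry has to be merged rather than pasted over the file. The sketch below shows one way to do that in Python. The helper names `merge_server` and `update_mcp_json` are made up for this example, and the server entry is copied from the config above.

```python
import json
from pathlib import Path

# Server entry copied from the config above; the API key is a placeholder.
TURBOPUFFER_ENTRY = {
    "command": "npx",
    "args": ["-y", "@turbopuffer/mcp-server"],
    "env": {"TURBOPUFFER_API_KEY": "your-api-key"},
}


def merge_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry stored under mcpServers[name].

    Other servers and unrelated top-level keys are left untouched.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_mcp_json(path: Path) -> None:
    """Hypothetical helper: merge the entry into the file at path."""
    # Start from an empty config when the file does not exist yet.
    config = json.loads(path.read_text()) if path.exists() else {}
    merged = merge_server(config, "turbopuffer", TURBOPUFFER_ENTRY)
    path.write_text(json.dumps(merged, indent=2) + "\n")
```

Calling `update_mcp_json(Path(".mcp.json"))` from the project root would add or replace only the `turbopuffer` entry.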

---

## Intro

Turbopuffer MCP is a Model Context Protocol server for the Turbopuffer serverless vector database, with 1,200+ GitHub stars. It gives AI agents such as Claude Code access to sub-10ms vector search with no infrastructure to manage: no database to deploy, no clusters to scale, no indexes to tune. Agents can store embeddings, run semantic searches, and build RAG pipelines from natural-language commands. It is best suited to developers who want production vector search without DevOps overhead. It works with Claude Code, Cursor, and any other MCP client, and setup takes under a minute.

---

## Available Tools

### `tpuf_upsert`
Store vectors with metadata:
```
"Store this document about authentication patterns with tags: auth, security"
```
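
Behind a prompt like this, an MCP client sends the server a JSON-RPC `tools/call` request. The sketch below builds such a request in Python to show its general shape. The envelope (`jsonrpc`, `method`, `params.name`, `params.arguments`) follows the MCP specification, but the argument names (`namespace`, `text`, `tags`) are assumptions made for illustration. The real schema comes from the server's own tool listing.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical arguments: the actual tpuf_upsert schema is not documented here.
request = tool_call_request(
    1,
    "tpuf_upsert",
    {
        "namespace": "docs",
        "text": "Notes on authentication patterns",
        "tags": ["auth", "security"],
    },
)

print(json.dumps(request, indent=2))
```

In practice the agent fills in the arguments from your prompt, so you rarely write this by hand.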

### `tpuf_search`
Semantic similarity search:
```
"Find the 5 most relevant documents about error handling"
```
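
To make "semantic similarity search" concrete, here is a minimal brute-force sketch in Python: rank stored vectors by cosine similarity to a query vector and keep the top `k`. It only illustrates the idea and is not how Turbopuffer implements search. A production vector database would typically use an approximate index instead of scanning every vector, and the toy two-dimensional vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], docs: dict[str, list[float]], k: int = 5) -> list[tuple[str, float]]:
    """Return the k (doc_id, score) pairs most similar to query, best first."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy data: made-up ids and 2-dimensional "embeddings".
docs = {
    "error-handling": [1.0, 0.0],
    "auth-patterns": [0.0, 1.0],
    "retry-logic": [1.0, 1.0],
}
results = top_k([1.0, 0.0], docs, k=2)
```

Here `results` ranks `error-handling` first and `retry-logic` second, because those vectors point closest to the query direction.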

### `tpuf_delete`
Remove vectors:
```
"Delete all vectors tagged with 'deprecated'"
```

### `tpuf_list_namespaces`
View all collections:
```
"Show all my vector namespaces and their sizes"
```

## Why Turbopuffer

### vs Self-Hosted Qdrant/Milvus
- Zero ops: no Docker, no clusters, no backups
- Auto-scaling: handles 10 to 10 million vectors
- Pay per query, not per server hour

### vs Pinecone
- Open pricing, no hidden costs
- Sub-10ms P99 latency
- No pod management

### Performance
| Metric | Value |
|--------|-------|
| Search latency (P50) | 3ms |
| Search latency (P99) | 8ms |
| Max vectors | Unlimited |
| Dimensions | Up to 4096 |
| Concurrent queries | Auto-scaled |
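
For readers unfamiliar with the P50 and P99 rows: they are percentiles of the per-query latency distribution. The sketch below computes them with the nearest-rank method in Python. The sample latencies are invented for the example and are not Turbopuffer measurements.

```python
import math


def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range 1..100")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up latencies in milliseconds, for illustration only.
latencies_ms = [2.0, 3.0, 3.0, 4.0, 8.0]

p50 = percentile(latencies_ms, 50)  # median: half the queries finish at or below this
p99 = percentile(latencies_ms, 99)  # tail: 99% of queries finish at or below this
```

P99 matters more than P50 for agent workloads that issue many searches per task, since one slow query can stall the whole chain.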

### Key Stats
- 1,200+ GitHub stars
- Sub-10ms search latency
- Serverless — zero infrastructure
- Auto-scaling from 0 to millions
- Free tier available

## FAQ

**Q: What is Turbopuffer MCP?**
A: An MCP server that connects AI agents to Turbopuffer, a serverless vector database, for semantic search and RAG with zero infrastructure management.

**Q: Is Turbopuffer free?**
A: Yes, a free tier is available with generous limits. The MCP server itself is open source.

**Q: How is Turbopuffer different from Qdrant?**
A: Turbopuffer is fully serverless — no Docker, no cluster management. Qdrant requires self-hosting or a managed plan. Turbopuffer is ideal for teams that want zero ops.

---

## Source & Thanks

> Created by [Turbopuffer](https://github.com/turbopuffer). Licensed under MIT.
>
> [turbopuffer](https://github.com/turbopuffer/turbopuffer) — ⭐ 1,200+

Thanks for making vector search as easy as a function call.

---
Source: https://tokrepo.com/en/workflows/turbopuffer-mcp-serverless-vector-db-ai-agents-2a5c2700
Author: MCP Hub