Pinecone Assistant — Managed RAG Service with Auto-Indexing

Name: Pinecone Assistant — Managed RAG Service with Auto-Indexing
Author: Pinecone

Introduction

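Pinecone Assistant is a fully managed RAG product: upload PDF, Word, or text files and get a chat endpoint that answers with citations. Pinecone handles chunking, embedding, retrieval, prompt construction, and citation rendering for you. It suits teams that want RAG over their documents without building the chunking, embedding, and prompt-construction layers themselves. It works with the Pinecone Python SDK, the Node SDK, the REST API, and the Pinecone Console. Setup takes about 5 minutes.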

Create an assistant and upload files

import os

from pinecone import Pinecone

# Read the API key from the environment instead of hard-coding it.
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create the assistant
assistant = pc.assistant.create_assistant(
    assistant_name="acme-docs",
    instructions="You are an Acme product support assistant. Cite sources.",
)

# Upload files
assistant.upload_file(file_path="./manual.pdf")
assistant.upload_file(file_path="./faq.md")
assistant.upload_file(file_path="./troubleshooting.docx")

Pinecone automatically chunks each document, embeds the chunks, stores them in a hidden vector index, and indexes the metadata.
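
To make the "chunks each document" step concrete, here is a minimal sketch of fixed-size chunking with overlap. This is a generic illustration only: the page does not describe Pinecone's actual chunking strategy, and the function name chunk_text and its parameters are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Illustrative only; not Pinecone's chunking algorithm.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which is the usual reason retrieval pipelines do not cut documents into disjoint pieces.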

Chat with citations

from pinecone_plugins.assistant.models.chat import Message

messages = [Message(role="user", content="How do I reset the device?")]
response = assistant.chat(messages=messages, model="claude-3-5-sonnet")

print(response.message.content)
# "To reset the device, hold the power button for 10 seconds [1]. After the
#  light blinks blue, release. The device will return to factory settings [2]."

for citation in response.citations:
    print(citation.references[0].file.name, citation.references[0].pages)
# manual.pdf [12]
# manual.pdf [13]
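
If you want every reference rather than only the first one per citation, a small helper can flatten them. This is a sketch that assumes only the attributes used in the snippet above (references, file.name, pages). The helper name format_citations and the stand-in data are hypothetical.

```python
from types import SimpleNamespace


def format_citations(citations) -> list[str]:
    """Return one 'file (pages ...)' line per citation reference."""
    lines = []
    for citation in citations:
        for ref in citation.references:
            # pages may be missing for formats without page numbers.
            pages = ", ".join(str(p) for p in (ref.pages or []))
            lines.append(f"{ref.file.name} (pages {pages})" if pages else ref.file.name)
    return lines


# Stand-in objects shaped like the fields used above (made-up data).
fake_citations = [
    SimpleNamespace(references=[
        SimpleNamespace(file=SimpleNamespace(name="manual.pdf"), pages=[12]),
    ]),
    SimpleNamespace(references=[
        SimpleNamespace(file=SimpleNamespace(name="faq.md"), pages=None),
    ]),
]
print(format_citations(fake_citations))
```

With a real response you would call format_citations(response.citations) instead of passing the stand-in list.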

Streaming responses

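# Streaming is requested with stream=True; only content chunks carry a text delta.
for chunk in assistant.chat(messages=messages, stream=True):
    if getattr(chunk, "delta", None) is not None:
        print(chunk.delta.content, end="", flush=True)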
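
When streaming, you often want the full reply afterwards as well, for logging or storage. The sketch below echoes text fragments as they arrive and returns the joined result. It works on plain strings, so it makes no assumption about the SDK's chunk shape. The name collect_stream is hypothetical.

```python
import io
import sys
from typing import Iterable, Optional, TextIO


def collect_stream(fragments: Iterable[Optional[str]], out: Optional[TextIO] = None) -> str:
    """Echo each text fragment as it arrives and return the full reply."""
    out = out if out is not None else sys.stdout
    parts = []
    for fragment in fragments:
        if not fragment:  # skip empty or None fragments
            continue
        out.write(fragment)
        out.flush()
        parts.append(fragment)
    return "".join(parts)


# Demo with made-up fragments written to an in-memory buffer.
buffer = io.StringIO()
full_reply = collect_stream(["To reset", "", " the device"], out=buffer)
```

In practice you would pass a generator that yields the text of each streamed chunk, and leave out at its default so the fragments go to standard output.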

Filter retrieval by metadata

# Tag the file at upload time
assistant.upload_file(
    file_path="./internal-only.pdf",
    metadata={"audience": "internal", "version": "2.0"},
)

# Filter at query time
response = assistant.chat(
    messages=messages,
    filter={"audience": {"$eq": "public"}},
)
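
To see what such a filter selects, here is a local sketch of the matching semantics for the $eq operator used above, plus $in as a second example. It is an illustration only: the real filter is evaluated by Pinecone, not by this code. The function name matches and the manual.pdf metadata are hypothetical.

```python
from typing import Any


def matches(metadata: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Return True if metadata satisfies every condition in the filter."""
    ops = {
        "$eq": lambda value, target: value == target,
        "$in": lambda value, target: value in target,
    }
    for field, condition in flt.items():
        value = metadata.get(field)
        for op, target in condition.items():
            if op not in ops:
                raise ValueError(f"unsupported operator in this sketch: {op}")
            if not ops[op](value, target):
                return False
    return True


# Made-up file metadata for the demo.
files = {
    "internal-only.pdf": {"audience": "internal", "version": "2.0"},
    "manual.pdf": {"audience": "public"},
}
visible = [name for name, md in files.items()
           if matches(md, {"audience": {"$eq": "public"}})]
print(visible)
```

With the public filter, the file tagged audience=internal is excluded, so it cannot contribute to the answer or its citations.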

When to use Assistant vs. building your own

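Use Assistant	Build your own
You want RAG running within an hour	You need full control over the chunking strategy
Pinecone's chunking is good enough	Specialized documents (legal, medical)
A few hundred MB of documents	TB-scale corpora
You want cited answers out of the box	Custom prompts and citation formats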

FAQ

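Q: Is Pinecone Assistant free? A: There is a free tier (2 assistants, with a cap on queries). Paid tiers add more queries and storage. Usage of the underlying LLM (Claude / GPT) is billed through Pinecone, at a slight markup over calling the models directly.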

Q: Which LLMs can Assistant use? A: GPT-4o, Claude 3.5 Sonnet, and other models that Pinecone keeps adding. Select one with model= when chatting. Pinecone handles the API keys and routing.

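Q: How does this differ from building your own RAG on a Pinecone index? A: With self-built RAG, you assemble chunking, embedding, retrieval, prompt construction, and citations yourself. With Assistant, Pinecone has built all of that and exposes a single chat() endpoint. For roughly 80% of use cases Assistant is the faster way to get started; build your own for long-tail customization needs.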
