What is Perplexity Sonar API — Search-Grounded LLM in One Call?

Perplexity Sonar API returns LLM answers grounded in real-time web search with citations. Tiers: sonar / sonar-pro / sonar-reasoning.

Is Perplexity Sonar API — Search-Grounded LLM in One Call free to use?

Yes. Perplexity Sonar API — Search-Grounded LLM in One Call is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install Perplexity Sonar API — Search-Grounded LLM in One Call?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Perplexity Sonar API — Search-Grounded LLM in One Call

Name: Perplexity Sonar API — Search-Grounded LLM in One Call
Author: Perplexity

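Introduction

Perplexity's Sonar API is a one-call replacement for building your own search + scrape + chunk + RAG pipeline: send a question, Perplexity searches the web in real time, and it returns an LLM answer with inline numbered citations that point to the source URLs. There are three tiers: sonar (fast, cheap), sonar-pro (better answer quality, more sources), and sonar-reasoning (chain of thought, longer thinking time). It suits news Q&A, fact-checking, and any scenario that needs fresh answers with sources. It works with OpenAI-compatible clients (Python / JS), curl, and LangChain. Setup takes about 2 minutes.

Python (OpenAI-compatible)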

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.perplexity.ai",
    api_key=os.environ["PPLX_API_KEY"],
)

resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What were the top 3 AI funding rounds this week?"}],
)
print(resp.choices[0].message.content)
# The answer text contains inline citation markers such as [1][2][3]

# Read the citation URLs separately
print(resp.citations)   # ["https://...", "https://...", "https://..."]

Filter sources by domain or recency

resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What is Anthropic's latest release?"}],
    extra_body={
        "search_domain_filter": ["anthropic.com", "techcrunch.com"],   # allowlist
        "search_recency_filter": "week",                                # day / week / month / year
        "return_images": False,
        "return_related_questions": True,
    },
)
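
The filter options above can be assembled and checked before the request is sent. The following is a minimal sketch, not part of the Perplexity or OpenAI SDKs: `build_search_options` and `ALLOWED_RECENCY` are hypothetical names, and the accepted recency values are taken only from the comment in the snippet above.

```python
# Hypothetical helper: builds the extra_body dict used in the snippet above.
# The allowed recency values mirror the "day / week / month / year" comment;
# treat them as an assumption, not an authoritative list.
ALLOWED_RECENCY = {"day", "week", "month", "year"}


def build_search_options(domains=None, recency=None, related_questions=False):
    """Return a dict suitable for passing as extra_body=... to create()."""
    options = {}
    if domains:
        # Allowlist of domains the search should be restricted to.
        options["search_domain_filter"] = list(domains)
    if recency is not None:
        if recency not in ALLOWED_RECENCY:
            raise ValueError(f"unsupported recency filter: {recency!r}")
        options["search_recency_filter"] = recency
    if related_questions:
        options["return_related_questions"] = True
    return options


if __name__ == "__main__":
    print(build_search_options(["anthropic.com", "techcrunch.com"], "week"))
```

The resulting dict would be passed as `extra_body=build_search_options(...)`, so a typo in the recency value fails locally instead of reaching the API.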

Model tiers (May 2026)

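Model	Use case	Cost ($ per million tokens, input / output)	Latency
`sonar`	Quick lookups, single-source Q&A	$1 / $1	~1–3 s
`sonar-pro`	Production answer quality, multiple sources	$3 / $15	~3–7 s
`sonar-reasoning`	Hard reasoning, citations plus thinking	$1 / $5	~10–25 s
`sonar-reasoning-pro`	Top-tier reasoning	$2 / $8	~15–40 s
`sonar-deep-research`	Long research reports, 30+ sources	$2 / $8 + per-search fee	Minutes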
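
To compare tiers for a given workload, the token prices in the table can be turned into a rough per-request estimate. This is a sketch under stated assumptions: `PRICES_PER_MILLION` and `estimate_cost` are hypothetical names, the prices are copied from the table as listed here and may have changed, and per-search fees (such as those for `sonar-deep-research`) are not included.

```python
# Prices in USD per million tokens as (input, output), copied from the table.
# Assumption: these figures are current; verify against Perplexity's pricing.
PRICES_PER_MILLION = {
    "sonar": (1.0, 1.0),
    "sonar-pro": (3.0, 15.0),
    "sonar-reasoning": (1.0, 5.0),
    "sonar-reasoning-pro": (2.0, 8.0),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the token cost in USD of one request, excluding search fees."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # 1,000 input tokens and 500 output tokens on sonar-pro.
    print(f"${estimate_cost('sonar-pro', 1000, 500):.4f}")
```

Token counts would typically come from the `usage` field of the response, assuming the endpoint reports usage the way OpenAI-compatible APIs usually do.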

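When not to use Sonar

If your data is private, not on the web, or lives in your own corpus, use a private RAG pipeline (Tavily + your own vector store) instead. Sonar only searches the public web.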

FAQ

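Q: Sonar vs Grok Live Search vs Tavily? A: Grok folds search into the same model call and is cheap. Sonar gives stronger answer quality and richer citations. Tavily is search-only (you bring your own LLM). Pick Sonar when answer quality matters; pick Tavily when you want control over the LLM stage.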

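Q: Are citations clickable? A: Citations come back separately as a citations array of URLs, apart from the markdown answer. Render them as numbered footnotes in your UI. The Sonar answer text also embeds [1], [2] inline, so the markers can be mapped visually to the footnotes.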
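
The footnote rendering described in this answer can be sketched as a small text transform. `render_footnotes` is a hypothetical helper, and it assumes that an inline marker `[n]` refers to the n-th entry (1-based) of the citations array, which should be confirmed against real responses.

```python
import re


def render_footnotes(content, citations):
    """Append a numbered footnote line for each [n] marker used in content.

    Assumption: marker [n] maps to citations[n - 1].
    Markers with no matching citation are left without a footnote.
    """
    used = sorted({int(n) for n in re.findall(r"\[(\d+)\]", content)})
    lines = [content, ""]
    for n in used:
        if 1 <= n <= len(citations):
            lines.append(f"[{n}] {citations[n - 1]}")
    return "\n".join(lines)


if __name__ == "__main__":
    answer = "Company A raised a round [1], followed by company B [2]."
    urls = ["https://a.example/news", "https://b.example/news"]
    print(render_footnotes(answer, urls))
```

In a web UI the same mapping could produce links instead of plain text lines; the plain-text form keeps the sketch independent of any rendering library.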

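Q: Rate limits? A: Standard tier: about 50 RPM for sonar and about 20 RPM for sonar-pro. Higher tiers are available at console.perplexity.ai. To go beyond production scale, contact Perplexity sales; they offer dedicated capacity.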
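
A client can stay under a requests-per-minute limit by spacing its calls out. The sketch below is one simple way to do that and is not a Perplexity feature: `MinIntervalThrottle` is a hypothetical class, and the 50 RPM figure is the approximate number quoted above, so the actual limit for an account should be checked in the console.

```python
import time


class MinIntervalThrottle:
    """Space calls so that at most max_per_minute happen per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._next_allowed = None

    def wait(self):
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Reserve the next slot one interval after this call.
        self._next_allowed = now + self.interval
        return delay


if __name__ == "__main__":
    throttle = MinIntervalThrottle(max_per_minute=50)
    for i in range(3):
        waited = throttle.wait()
        print(f"call {i}: waited {waited:.2f}s")  # issue the API request here
```

Calling `throttle.wait()` before each `client.chat.completions.create(...)` keeps a single process under the limit; it does not coordinate across multiple processes or handle 429 responses, which would still need retry logic.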
