# Perplexity Sonar API — Search-Grounded LLM in One Call

> Perplexity Sonar API returns LLM answers grounded in real-time web search, with citations. Tiers: sonar / sonar-pro / sonar-reasoning.

## Install

Copy the content below into your project:

## Quick Use

1. Get `PPLX_API_KEY` at perplexity.ai/settings/api
2. `OpenAI(base_url='https://api.perplexity.ai', api_key=PPLX_KEY)`
3. Use `model='sonar-pro'` and read `resp.citations` for the source URLs

---

## Intro

Perplexity's Sonar API is a one-call alternative to building a search + scrape + chunk + RAG pipeline yourself: you send a question, Perplexity searches the web in real time and returns an LLM answer with inline numbered citations pointing to the source URLs. There are three main tiers: `sonar` (fast and cheap), `sonar-pro` (better answer quality, more sources), and `sonar-reasoning` (chain-of-thought, longer think time).

Best for: news Q&A, fact-checking, anywhere you need a fresh answer with sources.
Works with: any OpenAI-compatible client (Python, JS), curl, LangChain.
Setup time: about 2 minutes.
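The doc also mentions a plain curl path; the same call can be sketched over raw HTTP with only the standard library. A minimal sketch: `build_request` is a hypothetical helper (not part of any SDK), and the payload assumes the standard OpenAI-style chat-completions schema.

```python
import json
import os
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(question: str, model: str = "sonar-pro") -> urllib.request.Request:
    """Build a chat-completions request (illustrative helper, not an official API)."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ.get('PPLX_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

req = build_request("What changed in the EU AI Act this month?")
# Sending it requires a real key and network access:
# with urllib.request.urlopen(req) as r:
#     answer = json.load(r)["choices"][0]["message"]["content"]
```

The OpenAI-compatible client below does the same thing with less ceremony; raw HTTP is mainly useful when you cannot add a dependency.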
---

### Python (OpenAI-compatible)

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.perplexity.ai",
    api_key=os.environ["PPLX_API_KEY"],
)

resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What are the top 3 AI funding rounds this week?"}],
)

print(resp.choices[0].message.content)
# The answer text includes inline citations like [1][2][3].

# Citation URLs come back separately:
print(resp.citations)  # ["https://...", "https://...", "https://..."]
```

### Filter sources by domain or recency

```python
resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What's the latest Anthropic announcement?"}],
    extra_body={
        "search_domain_filter": ["anthropic.com", "techcrunch.com"],  # allowlist
        "search_recency_filter": "week",  # day | week | month | year
        "return_images": False,
        "return_related_questions": True,
    },
)
```

### Model tiers (May 2026)

| Model | Use case | Cost ($/1M tokens) | Latency |
|---|---|---|---|
| `sonar` | Quick lookups, single-source Q&A | $1 in / $1 out | ~1–3 s |
| `sonar-pro` | Production answer quality, multi-source | $3 in / $15 out | ~3–7 s |
| `sonar-reasoning` | Hard reasoning, citations + thinking | $1 in / $5 out | ~10–25 s |
| `sonar-reasoning-pro` | Top-quality reasoning | $2 in / $8 out | ~15–40 s |
| `sonar-deep-research` | Long research reports with 30+ sources | $2 in / $8 out + per-search fee | minutes |

### When NOT to use Sonar

If your data is private, not on the public web, or in your own corpus, use a private RAG pipeline instead (e.g., Tavily + your own vector store). Sonar searches the public web only.

---

### FAQ

**Q: Sonar vs Grok Live Search vs Tavily?**
A: Grok bundles search into the same model call cheaply. Sonar gives stronger answer quality and richer citations. Tavily is search-only (you bring your own LLM). Use Sonar when answer quality matters; use Tavily when you need control over the LLM stage.
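Because the answer embeds `[1]`-style markers while the URLs arrive in the separate `citations` array, a small renderer can join the two into markdown footnotes. A minimal sketch; `render_with_footnotes` is an illustrative helper, not part of any SDK:

```python
def render_with_footnotes(content: str, citations: list[str]) -> str:
    """Append the citation URLs as a numbered footnote list under the answer.

    The inline [1], [2] markers in `content` line up with the 1-based
    positions of the URLs in `citations`.
    """
    footnotes = "\n".join(
        f"[{i}]: {url}" for i, url in enumerate(citations, start=1)
    )
    return f"{content}\n\n{footnotes}"

md = render_with_footnotes(
    "Anthropic shipped a new model [1] and raised a round [2].",
    ["https://anthropic.com/news", "https://techcrunch.com/story"],
)
print(md)
```

Markdown viewers that support reference-style links will then make the `[1]`, `[2]` markers resolvable; otherwise the footnote list still reads cleanly as plain text.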
**Q: Are citations clickable?**
A: Citations come back as a `citations` array of URLs, separate from the markdown answer. Render them as numbered footnotes in your UI. The answer content also embeds `[1]`, `[2]` inline, so you can map them visually.

**Q: Rate limits?**
A: Standard tier: ~50 RPM on sonar, ~20 RPM on sonar-pro. Higher tiers are available in console.perplexity.ai. For production scaling beyond that, talk to Perplexity sales; they offer dedicated capacity.

---

## Source & Thanks

> Built by [Perplexity](https://github.com/perplexityai). Sonar API docs at [docs.perplexity.ai](https://docs.perplexity.ai).
>
> Official SDK pending; the OpenAI-compatible client works today.
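The per-model RPM limits in the FAQ can be handled client-side with exponential backoff. A minimal sketch: the wrapper is generic (it retries any callable), and the commented usage assumes the `RateLimitError` the OpenAI-compatible Python client raises on HTTP 429.

```python
import time

def with_backoff(call, retries=4, base_delay=1.0, retry_on=(Exception,)):
    """Retry `call` with exponential backoff; re-raise after the last attempt."""
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Usage with the client from the examples above:
# from openai import RateLimitError
# resp = with_backoff(
#     lambda: client.chat.completions.create(model="sonar", messages=msgs),
#     retry_on=(RateLimitError,),
# )
```

Catching only the rate-limit exception (rather than the default `Exception`) keeps genuine errors, such as a bad API key, from being retried pointlessly.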
---

Source: https://tokrepo.com/en/workflows/perplexity-sonar-api-search-grounded-llm-in-one-call
Author: Perplexity