# Perplexity Sonar API — Search-Grounded LLM in One Call

> Perplexity Sonar API returns LLM answers grounded in real-time web search, with citations. Tiers: sonar / sonar-pro / sonar-reasoning.

## Install

Copy the content below into your project:

## Quick Use

1. Get `PPLX_API_KEY` at perplexity.ai/settings/api
2. `OpenAI(base_url='https://api.perplexity.ai', api_key=PPLX_KEY)`
3. Use `model='sonar-pro'` and read `resp.citations` for the source URLs

---

## Intro

Perplexity's Sonar API is a one-call alternative to building a search + scrape + chunk + RAG pipeline yourself: you send a question, Perplexity searches the web in real time and returns an LLM answer with inline numbered citations pointing to the source URLs. There are three main tiers: `sonar` (fast and cheap), `sonar-pro` (better answer quality, more sources), and `sonar-reasoning` (chain-of-thought, longer think time).

Best for: news Q&A, fact-checking, anywhere you need a fresh answer with sources.
Works with: any OpenAI-compatible client (Python, JS), curl, LangChain.
Setup time: about 2 minutes.
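The doc also mentions a plain curl path; the same call can be sketched over raw HTTP with only the standard library. A minimal sketch: `build_request` is a hypothetical helper (not part of any SDK), and the payload assumes the standard OpenAI-style chat-completions schema.

```python
import json
import os
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(question: str, model: str = "sonar-pro") -> urllib.request.Request:
    """Build a chat-completions request (illustrative helper, not an official API)."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ.get('PPLX_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

req = build_request("What changed in the EU AI Act this month?")
# Sending it requires a real key and network access:
# with urllib.request.urlopen(req) as r:
#     answer = json.load(r)["choices"][0]["message"]["content"]
```

The OpenAI-compatible client below does the same thing with less ceremony; raw HTTP is mainly useful when you cannot add a dependency.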
---

### Python (OpenAI-compatible)

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.perplexity.ai",
    api_key=os.environ["PPLX_API_KEY"],
)

resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What are the top 3 AI funding rounds this week?"}],
)

print(resp.choices[0].message.content)
# The answer text includes inline citations like [1][2][3].

# Citation URLs come back separately:
print(resp.citations)  # ["https://...", "https://...", "https://..."]
```

### Filter sources by domain or recency

```python
resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What's the latest Anthropic announcement?"}],
    extra_body={
        "search_domain_filter": ["anthropic.com", "techcrunch.com"],  # allowlist
        "search_recency_filter": "week",  # day | week | month | year
        "return_images": False,
        "return_related_questions": True,
    },
)
```

### Model tiers (May 2026)

| Model | Use case | Cost ($/1M tokens) | Latency |
|---|---|---|---|
| `sonar` | Quick lookups, single-source Q&A | $1 in / $1 out | ~1–3 s |
| `sonar-pro` | Production answer quality, multi-source | $3 in / $15 out | ~3–7 s |
| `sonar-reasoning` | Hard reasoning, citations + thinking | $1 in / $5 out | ~10–25 s |
| `sonar-reasoning-pro` | Top-quality reasoning | $2 in / $8 out | ~15–40 s |
| `sonar-deep-research` | Long research reports with 30+ sources | $2 in / $8 out + per-search fee | minutes |

### When NOT to use Sonar

If your data is private, not on the public web, or in your own corpus, use a private RAG pipeline instead (e.g., Tavily + your own vector store). Sonar searches the public web only.

---

### FAQ

**Q: Sonar vs Grok Live Search vs Tavily?**
A: Grok bundles search into the same model call cheaply. Sonar gives stronger answer quality and richer citations. Tavily is search-only (you bring your own LLM). Use Sonar when answer quality matters; use Tavily when you need control over the LLM stage.
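Because the answer embeds `[1]`-style markers while the URLs arrive in the separate `citations` array, a small renderer can join the two into markdown footnotes. A minimal sketch; `render_with_footnotes` is an illustrative helper, not part of any SDK:

```python
def render_with_footnotes(content: str, citations: list[str]) -> str:
    """Append the citation URLs as a numbered footnote list under the answer.

    The inline [1], [2] markers in `content` line up with the 1-based
    positions of the URLs in `citations`.
    """
    footnotes = "\n".join(
        f"[{i}]: {url}" for i, url in enumerate(citations, start=1)
    )
    return f"{content}\n\n{footnotes}"

md = render_with_footnotes(
    "Anthropic shipped a new model [1] and raised a round [2].",
    ["https://anthropic.com/news", "https://techcrunch.com/story"],
)
print(md)
```

Markdown viewers that support reference-style links will then make the `[1]`, `[2]` markers resolvable; otherwise the footnote list still reads cleanly as plain text.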
**Q: Are citations clickable?**
A: Citations come back as a `citations` array of URLs, separate from the markdown answer. Render them as numbered footnotes in your UI. The answer content also embeds `[1]`, `[2]` inline, so you can map them visually.

**Q: Rate limits?**
A: Standard tier: ~50 RPM on sonar, ~20 RPM on sonar-pro. Higher tiers are available in console.perplexity.ai. For production scaling beyond that, talk to Perplexity sales; they offer dedicated capacity.

---

## Source & Thanks

> Built by [Perplexity](https://github.com/perplexityai). Sonar API docs at [docs.perplexity.ai](https://docs.perplexity.ai).
>
> Official SDK pending; the OpenAI-compatible client works today.
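The per-model RPM limits in the FAQ can be handled client-side with exponential backoff. A minimal sketch: the wrapper is generic (it retries any callable), and the commented usage assumes the `RateLimitError` the OpenAI-compatible Python client raises on HTTP 429.

```python
import time

def with_backoff(call, retries=4, base_delay=1.0, retry_on=(Exception,)):
    """Retry `call` with exponential backoff; re-raise after the last attempt."""
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Usage with the client from the examples above:
# from openai import RateLimitError
# resp = with_backoff(
#     lambda: client.chat.completions.create(model="sonar", messages=msgs),
#     retry_on=(RateLimitError,),
# )
```

Catching only the rate-limit exception (rather than the default `Exception`) keeps genuine errors, such as a bad API key, from being retried pointlessly.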
---

Source: https://tokrepo.com/en/workflows/perplexity-sonar-api-search-grounded-llm-in-one-call
Author: Perplexity