Quick Use
- Get PPLX_API_KEY at perplexity.ai/settings/api
- Point the OpenAI client at Perplexity: OpenAI(base_url='https://api.perplexity.ai', api_key=PPLX_KEY)
- Use model='sonar-pro' and read resp.citations for source URLs
Intro
Perplexity's Sonar API is a one-call alternative to building search + scrape + chunk + RAG yourself — you send a question, Perplexity searches the web in real time and returns an LLM answer with inline numbered citations to the source URLs. Three tiers: sonar (fast/cheap), sonar-pro (better answer quality, more sources), sonar-reasoning (chain-of-thought, longer think time). Best for: news Q&A, fact-checking, anywhere you need a fresh answer with sources. Works with: OpenAI-compatible client (Python, JS), curl, LangChain. Setup time: 2 minutes.
Python (openai-compatible)
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.perplexity.ai",
    api_key=os.environ["PPLX_API_KEY"],
)
resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What are the top 3 AI funding rounds this week?"}],
)
print(resp.choices[0].message.content)
# Response includes inline citations like [1][2][3]
# Read citation URLs separately
print(resp.citations)  # ["https://...", "https://...", "https://..."]

Filter sources by domain or recency
resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What's the latest Anthropic announcement?"}],
    extra_body={
        "search_domain_filter": ["anthropic.com", "techcrunch.com"],  # whitelist
        "search_recency_filter": "week",  # day | week | month | year
        "return_images": False,
        "return_related_questions": True,
    },
)

Model tiers (May 2026)
| Model | Use case | Cost ($/1M tokens) | Latency |
|---|---|---|---|
| sonar | Quick lookups, single-source Q&A | $1 in / $1 out | ~1–3s |
| sonar-pro | Production answer quality, multi-source | $3 in / $15 out | ~3–7s |
| sonar-reasoning | Hard reasoning, citations + thinking | $1 in / $5 out | ~10–25s |
| sonar-reasoning-pro | Top quality reasoning | $2 in / $8 out | ~15–40s |
| sonar-deep-research | Long research reports with 30+ sources | $2 in / $8 out + per-search fees | ~minutes |
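The per-million-token prices in the table make per-call cost easy to estimate; a minimal sketch (prices copied from the table above; it ignores sonar-deep-research, whose extra per-search fees make flat token math misleading):

```python
# Per-million-token prices (USD) from the tier table above.
PRICES = {
    "sonar":               (1.0, 1.0),   # (input, output)
    "sonar-pro":           (3.0, 15.0),
    "sonar-reasoning":     (1.0, 5.0),
    "sonar-reasoning-pro": (2.0, 8.0),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough USD cost of one call: tokens / 1M times the per-million price."""
    p_in, p_out = PRICES[model]
    return input_tokens / 1e6 * p_in + output_tokens / 1e6 * p_out

# e.g. a sonar-pro call with 1,000 prompt tokens and 500 completion tokens:
print(estimate_cost("sonar-pro", 1_000, 500))  # 0.0105
```

Token counts come back on the response as `resp.usage.prompt_tokens` / `resp.usage.completion_tokens` in the OpenAI-compatible schema.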
When NOT to use Sonar
If your data is private, not on the web, or in your own corpus, use a private RAG pipeline instead (e.g., Tavily + your own vector store). Sonar searches the public web only.
FAQ
Q: Sonar vs Grok Live Search vs Tavily? A: Grok bundles search into the same model call cheaply. Sonar gives stronger answer quality and richer citations. Tavily is search-only (you bring your own LLM). Use Sonar when answer quality matters; Tavily when you need control over the LLM stage.
Q: Are citations clickable?
A: Citations come back as a citations array of URLs separately from the markdown answer. Render them as numbered footnotes in your UI. Sonar's content also embeds [1], [2] inline so you can map them visually.
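Joining the inline markers to the citations array is a few lines of string work; a minimal footnote formatter (the sample answer and URLs below are hypothetical):

```python
def render_with_footnotes(content: str, citations: list[str]) -> str:
    """Append a numbered footnote list so the inline [1], [2] markers
    in the answer resolve to their citation URLs."""
    footnotes = "\n".join(f"[{i}]: {url}" for i, url in enumerate(citations, start=1))
    return f"{content}\n\n{footnotes}"

# Hypothetical response data:
answer = "Anthropic announced a new model this week.[1][2]"
urls = ["https://anthropic.com/news", "https://techcrunch.com/example"]
print(render_with_footnotes(answer, urls))
```

In a real handler you would pass `resp.choices[0].message.content` and `resp.citations`.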
Q: Rate limits? A: Standard tier: ~50 RPM on sonar, ~20 RPM on sonar-pro. Higher tiers are available in console.perplexity.ai. For production scaling beyond that, talk to Perplexity Sales — they offer dedicated capacity.
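At ~20–50 RPM, bursty traffic will hit HTTP 429s; a minimal exponential-backoff retry sketch (stdlib only; the retry counts and delays are arbitrary, and you would pass the openai package's RateLimitError as the retryable type):

```python
import random
import time

def with_backoff(call, retryable=(Exception,), max_attempts=5, base_delay=1.0):
    """Retry call() on retryable errors with exponential backoff plus jitter;
    re-raise the last error once max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.25))

# Usage sketch against the client from the examples above:
# resp = with_backoff(
#     lambda: client.chat.completions.create(model="sonar-pro", messages=msgs),
#     retryable=(openai.RateLimitError,),
# )
```

Keeping the retryable exception types explicit avoids retrying non-transient failures such as auth errors.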
Source & Thanks
Built by Perplexity. Sonar API docs at docs.perplexity.ai.
Official SDK pending; OpenAI-compatible client works today.