Knowledge · May 11, 2026 · 5 min read

Perplexity Sonar API — Search-Grounded LLM in One Call

Perplexity Sonar API returns LLM answers grounded in real-time web search with citations. Tiers: sonar / sonar-pro / sonar-reasoning.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and the raw content to help agents judge fit, risk, and next actions.

Stage only · 15/100
Agent surface: Any MCP/CLI agent
Type: Knowledge
Installation: Stage only
Trust: New
Entry point: Asset

Universal CLI command
npx tokrepo install 25b2aa98-cc43-4d6c-b654-5baa3f3c9f62
Introduction

Perplexity's Sonar API is a one-call alternative to building search + scrape + chunk + RAG yourself: you send a question, Perplexity searches the web in real time and returns an LLM answer with inline numbered citations to the source URLs. Core tiers: sonar (fast/cheap), sonar-pro (better answer quality, more sources), sonar-reasoning (chain-of-thought, longer think time); see the full model table below. Best for: news Q&A, fact-checking, anywhere you need a fresh answer with sources. Works with: OpenAI-compatible clients (Python, JS), curl, LangChain. Setup time: 2 minutes.
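
Under the hood this is a single OpenAI-compatible chat/completions endpoint, so even a plain HTTP call works. A minimal sketch using Python's requests (the question string is just an illustration):

import os

import requests

# POST directly to the chat/completions endpoint -- no SDK required
resp = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['PPLX_API_KEY']}"},
    json={
        "model": "sonar",
        "messages": [{"role": "user", "content": "What changed in the EU AI Act this month?"}],
    },
    timeout=60,
)
data = resp.json()
print(data["choices"][0]["message"]["content"])
print(data.get("citations", []))  # list of source URLs, when present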


Python (openai-compatible)

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.perplexity.ai",
    api_key=os.environ["PPLX_API_KEY"],
)

resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What are the top 3 AI funding rounds this week?"}],
)
print(resp.choices[0].message.content)
# Response includes inline citations like [1][2][3]

# Read citation URLs separately
print(resp.citations)   # ["https://...", "https://...", "https://..."]

Filter sources by domain or recency

resp = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "What's the latest Anthropic announcement?"}],
    extra_body={
        "search_domain_filter": ["anthropic.com", "techcrunch.com"],   # whitelist
        "search_recency_filter": "week",                                # day | week | month | year
        "return_images": False,
        "return_related_questions": True,
    },
)
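
Because the endpoint is OpenAI-compatible, streaming should work the usual way with stream=True. A hedged sketch reusing the client from above (how citations arrive on streamed responses may differ, so check the final response object):

# Stream tokens as they arrive; assumes the `client` from the setup above
stream = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "Summarize today's top tech story."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()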

Model tiers (May 2026)

Model                 Use case                                   Cost ($/1M tokens)                 Latency
sonar                 Quick lookups, single-source Q&A           $1 in / $1 out                     ~1–3 s
sonar-pro             Production answer quality, multi-source    $3 in / $15 out                    ~3–7 s
sonar-reasoning       Hard reasoning, citations + thinking       $1 in / $5 out                     ~10–25 s
sonar-reasoning-pro   Top-quality reasoning                      $2 in / $8 out                     ~15–40 s
sonar-deep-research   Long research reports with 30+ sources     $2 in / $8 out + per-search fees   minutes
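
The table translates directly into a per-request cost estimate. A worked example using the rates above (token counts are illustrative; read the usage field on the response for real numbers; sonar-deep-research is omitted because its per-search fees aren't captured by token rates):

# Per-request cost from the tier table above ($ per 1M tokens: input, output)
PRICES = {
    "sonar":               (1.0,  1.0),
    "sonar-pro":           (3.0, 15.0),
    "sonar-reasoning":     (1.0,  5.0),
    "sonar-reasoning-pro": (2.0,  8.0),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    in_rate, out_rate = PRICES[model]
    return prompt_tokens / 1e6 * in_rate + completion_tokens / 1e6 * out_rate

# e.g. a sonar-pro call with 500 input and 1,000 output tokens:
# 500/1M * $3 + 1000/1M * $15 = $0.0015 + $0.015 = $0.0165
print(f"${estimate_cost('sonar-pro', 500, 1_000):.4f}")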

When NOT to use Sonar

If your data is private, not on the web, or lives in your own corpus, use a private RAG pipeline instead (e.g., Tavily plus your own vector store). Sonar searches the public web only.


FAQ

Q: Sonar vs Grok Live Search vs Tavily? A: Grok bundles search into the same model call cheaply. Sonar gives stronger answer quality and richer citations. Tavily is search-only (you bring your own LLM). Use Sonar when answer quality matters; Tavily when you need control over the LLM stage.

Q: Are citations clickable? A: Citations come back as a citations array of URLs separately from the markdown answer. Render them as numbered footnotes in your UI. Sonar's content also embeds [1], [2] inline so you can map them visually.
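
One way to wire the inline [n] markers to the citations array is a small post-processing helper. A minimal sketch (the function name is ours, not part of any SDK):

import re

def link_citations(content: str, citations: list[str]) -> str:
    """Turn inline [n] markers into markdown links using the 1-based citations array."""
    def repl(m: re.Match) -> str:
        i = int(m.group(1))
        if 1 <= i <= len(citations):
            return f"[{m.group(0)}]({citations[i - 1]})"
        return m.group(0)  # leave out-of-range markers untouched
    return re.sub(r"\[(\d+)\]", repl, content)

# print(link_citations(resp.choices[0].message.content, resp.citations))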

Q: Rate limits? A: Standard tier: ~50 RPM on sonar, ~20 RPM on sonar-pro. Higher tiers are available in console.perplexity.ai. For production scale beyond that, contact Perplexity sales; they offer dedicated capacity.
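
At those limits, 429s are a matter of when, not if. A simple jittered-backoff wrapper as a sketch (retry count and sleep cap are arbitrary choices, not Perplexity recommendations):

import random
import time

from openai import RateLimitError

def create_with_retry(client, max_retries: int = 5, **kwargs):
    """Retry chat.completions.create on 429s with jittered exponential backoff."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(min(2 ** attempt + random.random(), 30))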


Quick Use

  1. Get a PPLX_API_KEY at perplexity.ai/settings/api
  2. client = OpenAI(base_url="https://api.perplexity.ai", api_key=PPLX_API_KEY)
  3. Use model="sonar-pro" and read resp.citations for source URLs


Source & Thanks

Built by Perplexity. Sonar API docs at docs.perplexity.ai.

Official SDK pending; OpenAI-compatible client works today.

🙏

