Quick Use
- Get an API key at console.x.ai
- Point the OpenAI SDK at xAI: OpenAI(base_url='https://api.x.ai/v1', api_key=XAI_KEY)
- Set model='grok-3' and you're done
Intro
xAI's Grok API is OpenAI-compatible — point the OpenAI SDK at api.x.ai/v1 with an XAI_API_KEY, change model='grok-3', and you're running on Grok. Grok-3 is the flagship reasoning model with 1M-token context; Grok-2 Vision handles images; Live Search gives real-time web results inside the call. Best for: apps that need fresh real-time knowledge (news, prices, sports) without a separate retrieval layer; very long context tasks (>200K tokens). Works with: openai-python, openai-node, LangChain, LlamaIndex via OpenAI-compatible adapter. Setup time: 2 minutes.
Python (openai SDK)
```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.x.ai/v1",
    api_key=os.environ["XAI_API_KEY"],
)
resp = client.chat.completions.create(
    model="grok-3",
    messages=[{"role": "user", "content": "What's the latest on the SpaceX launch this week?"}],
)
print(resp.choices[0].message.content)
```
Vision with Grok-2
```python
resp = client.chat.completions.create(
    model="grok-2-vision-latest",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's wrong with this UI screenshot?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
        ],
    }],
)
```
Live Search (real-time web grounding)
```python
resp = client.chat.completions.create(
    model="grok-3",
    messages=[{"role": "user", "content": "What's BTC price right now and the top 3 reasons it moved today?"}],
    extra_body={
        "search_parameters": {
            "mode": "on",  # off | auto | on
            "sources": [{"type": "web"}, {"type": "x"}, {"type": "news"}],
            "max_search_results": 8,
        }
    },
)
print(resp.choices[0].message.content)
print(resp.usage.num_sources_used)  # how many sources Grok grounded against
```
Model lineup
| Model ID | Context (tokens) | Best for |
|---|---|---|
| grok-3 | 1,000,000 | Long-context reasoning, complex agents |
| grok-3-mini | 131,072 | Fast, cheap reasoning |
| grok-2-vision-latest | 32,768 | Image understanding |
| grok-2-image-latest | n/a | Image generation |
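To make the table concrete, here is a small illustrative helper (not part of any SDK; model names and context sizes are taken from the table above) that picks the smallest text model whose context window fits a prompt:

```python
# Context windows from the table above, in tokens (text models only).
CONTEXT_WINDOWS = {
    "grok-3-mini": 131_072,
    "grok-3": 1_000_000,
}

def pick_model(prompt_tokens: int) -> str:
    """Return the smallest text model whose context window fits the prompt."""
    for model, window in sorted(CONTEXT_WINDOWS.items(), key=lambda kv: kv[1]):
        if prompt_tokens <= window:
            return model
    raise ValueError(f"{prompt_tokens} tokens exceeds every listed context window")

print(pick_model(50_000))   # fits grok-3-mini
print(pick_model(500_000))  # needs grok-3
```

Token counts here are rough; in practice you'd measure them with a tokenizer before choosing.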
Pricing (per 1M tokens, May 2026)
- Grok-3: $5 input / $15 output
- Grok-3-mini: $0.30 input / $0.50 output
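The per-1M-token rates above translate into per-call costs like this (a sketch using the May 2026 list prices; actual billing may differ):

```python
# List prices in USD per 1M tokens (May 2026, from the bullets above).
PRICES = {
    "grok-3": {"input": 5.00, "output": 15.00},
    "grok-3-mini": {"input": 0.30, "output": 0.50},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one call from its token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A 200K-token context with a 10K-token answer on grok-3:
print(round(call_cost("grok-3", 200_000, 10_000), 2))  # 1.15
```

So a large long-context call on grok-3 costs about a dollar, while the same call on grok-3-mini would be a few cents.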
Migration from OpenAI
The only changes: base_url and the model string. Tools, vision, JSON mode, streaming — all work identically.
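As a sketch, the migration is a two-field delta plus the key swap (the OpenAI-side model name here is illustrative):

```python
# Everything else (messages, tools, streaming flags) carries over unchanged.
openai_cfg = {
    "base_url": "https://api.openai.com/v1",
    "api_key_env": "OPENAI_API_KEY",
    "model": "gpt-4o",  # illustrative OpenAI model name
}
grok_cfg = {
    **openai_cfg,
    "base_url": "https://api.x.ai/v1",  # change 1: endpoint
    "api_key_env": "XAI_API_KEY",       # plus the key swap
    "model": "grok-3",                  # change 2: model string
}
print(sorted(k for k in grok_cfg if grok_cfg[k] != openai_cfg[k]))
```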
FAQ
Q: How does Grok's 1M-token context compare to Gemini's 2M?
A: Both handle full-corpus tasks. Grok-3 input is $5/M; Gemini 2.5 Pro is ~$1.25/M but has stricter rate limits. For typical 100K–500K-token jobs, Grok is faster end-to-end; beyond 800K, both work.
Q: What sources does Live Search index?
A: Web (search engine), X (Twitter posts), News (press articles), and RSS. You whitelist via the sources array. Grok returns inline citations and a num_sources_used count for verification.
Q: Is the OpenAI-compat layer feature-complete?
A: Mostly: chat.completions, streaming, tools, vision, and JSON mode all work. Audio (TTS / Whisper) is not on the xAI API; use OpenAI for those. Embeddings are not yet available on xAI as of May 2026.
Source & Thanks
Built by xAI. API docs at docs.x.ai.
Public SDK: xai-org