What is Bifrost CLI?
Bifrost is an enterprise AI gateway that provides a unified OpenAI-compatible API across 20+ AI providers and 1,000+ models. The CLI component lets developers run Claude Code, Codex CLI, Gemini CLI, and other coding agents with any model from any provider. Each Claude Code model tier can be overridden independently: run GPT-5 at the Sonnet tier, Gemini 2.5 Pro at the Opus tier, and a fast Groq-hosted model at the Haiku tier.
Answer-Ready: Bifrost CLI is an AI gateway for running Claude Code with any LLM provider. Unified API across 20+ providers (OpenAI, Gemini, Bedrock, Groq, etc.), automatic failover, semantic caching, and per-tier model overrides. Sub-100 microsecond overhead. 3.6k+ GitHub stars.
Best for: Teams wanting model flexibility and provider redundancy for AI coding agents. Works with: Claude Code, Codex CLI, Gemini CLI, Cursor, Roo Code. Setup time: Under 2 minutes.
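Setup, in practice, is two steps: start the gateway, then point your agent at it. A minimal sketch is below; the npx package name and the /anthropic route are assumptions about Bifrost's published quickstart, so confirm both against the docs. ANTHROPIC_BASE_URL itself is the standard Claude Code variable for redirecting API traffic.

```bash
# Start the Bifrost gateway locally (package name is an assumption;
# check the Bifrost docs for the exact install/run command).
npx -y @maximhq/bifrost

# Point Claude Code at the gateway instead of api.anthropic.com.
# The /anthropic path is an assumption about Bifrost's
# Anthropic-compatible route.
export ANTHROPIC_BASE_URL="http://localhost:8080/anthropic"
claude
```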
Core Features
1. Per-Tier Model Override
```yaml
# Use different models for different Claude Code tiers
tiers:
  opus: "google/gemini-2.5-pro"
  sonnet: "openai/gpt-5"
  haiku: "groq/llama-3.3-70b"
```

2. 20+ Supported Providers
| Provider | Models |
|---|---|
| OpenAI | GPT-5, GPT-4o |
| Anthropic | Claude Opus, Sonnet |
| Google | Gemini 2.5 Pro/Flash |
| AWS Bedrock | All Bedrock models |
| Azure OpenAI | Azure-hosted models |
| Groq | Ultra-fast inference |
| Cerebras | Fast inference |
| Mistral | Mistral Large, Codestral |
| Cohere | Command R+ |
| xAI | Grok |
| Ollama | Local models |
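Since every provider sits behind the same OpenAI-compatible surface, switching providers is a one-line model-string change rather than an SDK swap. A minimal sketch, assuming the gateway runs on the same localhost:8080 used by the dashboard and exposes the standard /v1/chat/completions route:

```bash
# Call the gateway's OpenAI-compatible endpoint; "provider/model"
# naming follows the tier examples above. Swap the model string to
# switch providers without touching client code.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "groq/llama-3.3-70b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```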
3. Automatic Failover
```yaml
# If the primary provider fails, requests fall back automatically
routes:
  - provider: openai
    priority: 1
  - provider: anthropic
    priority: 2
  - provider: groq
    priority: 3
```

4. Real-Time Monitoring
```
# Dashboard at http://localhost:8080/logs
# Track: requests, latency, tokens, costs, errors
```

5. Semantic Caching
Caches semantically similar requests to reduce cost and latency, with a configurable similarity threshold.
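The overview doesn't show the cache configuration, but conceptually it's a similarity threshold plus an expiry. A hypothetical sketch (key names are illustrative, not Bifrost's documented schema):

```yaml
# Hypothetical cache settings; key names are illustrative only.
cache:
  semantic: true
  similarity_threshold: 0.95   # how close two prompts must be to share a cached response
  ttl: 3600                    # seconds before a cached entry expires
```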
6. Performance
| Metric | Value |
|---|---|
| Overhead | <100 microseconds |
| Throughput | 5,000 RPS |
| Caching | Semantic similarity |
Supported Agents
| Agent | Integration |
|---|---|
| Claude Code | MCP or API proxy |
| Codex CLI | API proxy |
| Gemini CLI | API proxy |
| Cursor | API proxy |
| Roo Code | API proxy |
| Qwen Code | API proxy |
FAQ
Q: How does it compare to LiteLLM? A: Bifrost claims to be roughly 50x faster than LiteLLM, with sub-100 microsecond overhead per request. Enterprise features include budget management and governance.
Q: Can I use it for cost optimization? A: Yes. Route simple tasks to cheap, fast providers (Groq, Cerebras) and complex tasks to premium models, and set budget limits per project.
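One way to express that split is the tier-override pattern from feature 1; the mapping below is a sketch with illustrative model choices, not a recommendation:

```yaml
# Cost-tiered routing via tier overrides; model choices are examples.
tiers:
  opus: "anthropic/claude-opus-4"   # complex reasoning -> premium model
  sonnet: "openai/gpt-5"            # everyday coding tasks
  haiku: "groq/llama-3.3-70b"       # quick, cheap completions
```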
Q: Is it open source? A: Yes, Apache 2.0. Maxim also offers a managed cloud version.