Workflows · Apr 8, 2026 · 3 min read

Bifrost CLI — Run Claude Code with Any AI Model

Enterprise AI gateway that lets Claude Code use any LLM provider. Bifrost routes requests to OpenAI, Gemini, Bedrock, Groq, and 20+ providers with automatic failover.

Quick Use

Use it first, then decide how deep to go.

Copy the commands below to install the gateway and connect Claude Code:

# Install
npx -y @maximhq/bifrost

# Connect Claude Code
claude mcp add --transport http bifrost http://localhost:8080/mcp
# Or Docker
docker run -p 8080:8080 maximhq/bifrost

What is Bifrost CLI?

Bifrost is an enterprise AI gateway that provides a unified OpenAI-compatible API across 20+ AI providers and 1000+ models. The CLI component lets developers run Claude Code, Codex CLI, Gemini CLI, and other coding agents with any model from any provider. You can override each Claude Code model tier independently: run GPT-5 for the Sonnet tier, Gemini 2.5 Pro for the Opus tier, and Groq for the Haiku tier.

Answer-Ready: Bifrost CLI is an AI gateway for running Claude Code with any LLM provider. Unified API across 20+ providers (OpenAI, Gemini, Bedrock, Groq, etc.), automatic failover, semantic caching, and per-tier model overrides. Sub-100 microsecond overhead. 3.6k+ GitHub stars.

Best for: Teams wanting model flexibility and provider redundancy for AI coding agents. Works with: Claude Code, Codex CLI, Gemini CLI, Cursor, Roo Code. Setup time: Under 2 minutes.
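Because the unified API is OpenAI-compatible, a request through Bifrost looks like a standard chat-completions call with a provider-prefixed model name. Here is a minimal sketch of the request shape; the endpoint path and the `provider/model` naming are assumptions based on this post's config examples, so check the Bifrost docs for the exact format:

```python
import json

# Assumed local endpoint for Bifrost's OpenAI-compatible API (not verified).
BIFROST_URL = "http://localhost:8080/v1/chat/completions"

# Provider-prefixed model string, following the tier config examples.
payload = {
    "model": "openai/gpt-5",
    "messages": [
        {"role": "user", "content": "Explain this stack trace."},
    ],
}

# This is the JSON body you would POST to BIFROST_URL.
body = json.dumps(payload)
print(body)
```

In practice you would point any OpenAI-compatible client at the gateway URL instead of building the request by hand.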

Core Features

1. Per-Tier Model Override

# Use different models for different Claude Code tiers
tiers:
  opus: "google/gemini-2.5-pro"
  sonnet: "openai/gpt-5"
  haiku: "groq/llama-3.3-70b"
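One way to picture the override: Bifrost resolves whichever tier Claude Code requested to the provider/model pair in the config above. A toy resolution function is sketched below; the function name and the fallback-to-sonnet behavior are illustrative, not Bifrost's actual code:

```python
# Toy model of per-tier override resolution (illustrative only).
TIERS = {
    "opus": "google/gemini-2.5-pro",
    "sonnet": "openai/gpt-5",
    "haiku": "groq/llama-3.3-70b",
}

def resolve_model(requested_tier: str) -> str:
    """Map a Claude Code tier name to its configured provider/model string."""
    # Fall back to the sonnet mapping for unknown tiers (an assumption,
    # not Bifrost's documented behavior).
    return TIERS.get(requested_tier, TIERS["sonnet"])

print(resolve_model("opus"))   # google/gemini-2.5-pro
```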

2. 20+ Supported Providers

| Provider | Models |
| --- | --- |
| OpenAI | GPT-5, GPT-4o |
| Anthropic | Claude Opus, Sonnet |
| Google | Gemini 2.5 Pro/Flash |
| AWS Bedrock | All Bedrock models |
| Azure OpenAI | Azure-hosted models |
| Groq | Ultra-fast inference |
| Cerebras | Fast inference |
| Mistral | Mistral Large, Codestral |
| Cohere | Command R+ |
| xAI | Grok |
| Ollama | Local models |

3. Automatic Failover

# If primary fails, automatically falls back
routes:
  - provider: openai
    priority: 1
  - provider: anthropic
    priority: 2
  - provider: groq
    priority: 3
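The priority list behaves like ordered fallback: try the priority-1 provider, and on error move down the list. A minimal sketch of that control flow follows; the provider call is a stub that pretends the primary is down, and Bifrost's real retry and error semantics may differ:

```python
# Sketch of priority-ordered failover (illustrative, not Bifrost internals).
ROUTES = [
    {"provider": "openai", "priority": 1},
    {"provider": "anthropic", "priority": 2},
    {"provider": "groq", "priority": 3},
]

def call_provider(provider: str, prompt: str) -> str:
    """Stub: pretend openai is unreachable so the request falls through."""
    if provider == "openai":
        raise ConnectionError("primary unavailable")
    return f"{provider}: response to {prompt!r}"

def complete_with_failover(prompt: str) -> str:
    last_error = None
    for route in sorted(ROUTES, key=lambda r: r["priority"]):
        try:
            return call_provider(route["provider"], prompt)
        except ConnectionError as exc:
            last_error = exc            # remember the failure, try next route
    raise RuntimeError("all providers failed") from last_error

print(complete_with_failover("hello"))  # served by anthropic, the 2nd route
```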

4. Real-Time Monitoring

# Dashboard at localhost:8080/logs
# Track: requests, latency, tokens, costs, errors

5. Semantic Caching

Cache similar requests to reduce costs and latency. Configurable similarity threshold.
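Semantic caching means a new prompt can be served from cache when its embedding is close enough to a previously seen prompt. A toy version with hand-rolled cosine similarity and a configurable threshold is sketched below; the embeddings are fake vectors, and the class name and threshold default are illustrative, not Bifrost's implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

class SemanticCache:
    def __init__(self, threshold=0.9):
        self.threshold = threshold      # configurable similarity cutoff
        self.entries = []               # list of (embedding, response)

    def get(self, embedding):
        for cached_emb, response in self.entries:
            if cosine(embedding, cached_emb) >= self.threshold:
                return response         # close enough: cache hit
        return None                     # cache miss

    def put(self, embedding, response):
        self.entries.append((embedding, response))

cache = SemanticCache(threshold=0.95)
cache.put([1.0, 0.0, 0.1], "cached answer")
print(cache.get([0.99, 0.01, 0.1]))    # near-duplicate prompt -> hit
print(cache.get([0.0, 1.0, 0.0]))      # unrelated prompt -> None
```

Raising the threshold trades cache hit rate for answer fidelity, which is the knob the configurable similarity threshold exposes.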

6. Performance

| Metric | Value |
| --- | --- |
| Overhead | <100 microseconds |
| Throughput | 5,000 RPS |
| Caching | Semantic similarity |

Supported Agents

| Agent | Integration |
| --- | --- |
| Claude Code | MCP or API proxy |
| Codex CLI | API proxy |
| Gemini CLI | API proxy |
| Cursor | API proxy |
| Roo Code | API proxy |
| Qwen Code | API proxy |

FAQ

Q: How does it compare to LiteLLM?
A: Bifrost claims to be 50x faster than LiteLLM, with sub-100-microsecond overhead. Enterprise features include budget management and governance.

Q: Can I use it for cost optimization?
A: Yes. Route simple tasks to cheap, fast providers (Groq, Cerebras) and complex tasks to premium models, with budget limits per project.

Q: Is it open source?
A: Yes, Apache 2.0. Maxim also offers a managed cloud version.


Source & Thanks

Created by Maxim. Licensed under Apache 2.0.

maximhq/bifrost — 3.6k+ stars
