Workflows · Apr 8, 2026 · 3 min read

Bifrost CLI — Run Claude Code with Any AI Model

Enterprise AI gateway that lets Claude Code use any LLM provider. Bifrost routes requests to OpenAI, Gemini, Bedrock, Groq, and 20+ providers with automatic failover.

What is Bifrost CLI?

Bifrost is an enterprise AI gateway that provides a unified OpenAI-compatible API across 20+ AI providers and 1000+ models. The CLI component lets developers run Claude Code, Codex CLI, Gemini CLI, and other coding agents with any model from any provider. Override each Claude Code model tier independently — run GPT-5 for Sonnet tier, Gemini 2.5 Pro for Opus tier, Groq for Haiku tier.

Answer-Ready: Bifrost CLI is an AI gateway for running Claude Code with any LLM provider. Unified API across 20+ providers (OpenAI, Gemini, Bedrock, Groq, etc.), automatic failover, semantic caching, and per-tier model overrides. Sub-100 microsecond overhead. 3.6k+ GitHub stars.

Best for: Teams wanting model flexibility and provider redundancy for AI coding agents. Works with: Claude Code, Codex CLI, Gemini CLI, Cursor, Roo Code. Setup time: Under 2 minutes.

Core Features

1. Per-Tier Model Override

```yaml
# Use different models for different Claude Code tiers
tiers:
  opus: "google/gemini-2.5-pro"
  sonnet: "openai/gpt-5"
  haiku: "groq/llama-3.3-70b"
```
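Conceptually, the tier mapping above is a lookup: when Claude Code requests a tier, the gateway substitutes the configured provider/model. A minimal sketch of that substitution (the dict mirrors the config above; the function name is illustrative, not Bifrost's actual API):

```python
# Sketch of per-tier model substitution, mirroring the YAML config above.
# Names here are illustrative, not Bifrost's internals.
TIER_OVERRIDES = {
    "opus": "google/gemini-2.5-pro",
    "sonnet": "openai/gpt-5",
    "haiku": "groq/llama-3.3-70b",
}

def resolve_model(requested_tier: str) -> str:
    """Return the overridden provider/model for a Claude Code tier."""
    try:
        return TIER_OVERRIDES[requested_tier]
    except KeyError:
        raise ValueError(f"no override configured for tier {requested_tier!r}")

print(resolve_model("sonnet"))  # openai/gpt-5
```

Each tier resolves independently, which is what lets one session mix three providers.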

2. 20+ Supported Providers

| Provider | Models |
| --- | --- |
| OpenAI | GPT-5, GPT-4o |
| Anthropic | Claude Opus, Sonnet |
| Google | Gemini 2.5 Pro/Flash |
| AWS Bedrock | All Bedrock models |
| Azure OpenAI | Azure-hosted models |
| Groq | Ultra-fast inference |
| Cerebras | Fast inference |
| Mistral | Mistral Large, Codestral |
| Cohere | Command R+ |
| xAI | Grok |
| Ollama | Local models |

3. Automatic Failover

```yaml
# If the primary provider fails, requests automatically fall back
routes:
  - provider: openai
    priority: 1
  - provider: anthropic
    priority: 2
  - provider: groq
    priority: 3
```
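The failover behavior behind that config can be sketched as a loop over routes in priority order, falling through on error. This is an illustration of the technique, not Bifrost's implementation; the function names and the simulated outage are invented for the example:

```python
# Sketch of priority-ordered failover, mirroring the routes above.
# Illustrative only; not Bifrost's internals.
ROUTES = [
    {"provider": "openai", "priority": 1},
    {"provider": "anthropic", "priority": 2},
    {"provider": "groq", "priority": 3},
]

def call_with_failover(send):
    """Try providers in priority order; `send` raises when a provider fails."""
    last_err = None
    for route in sorted(ROUTES, key=lambda r: r["priority"]):
        try:
            return send(route["provider"])
        except Exception as err:
            last_err = err  # remember the failure, try the next provider
    raise RuntimeError("all providers failed") from last_err

def fake_send(provider):
    """Simulate the primary provider being down."""
    if provider == "openai":
        raise ConnectionError("primary unavailable")
    return f"answered by {provider}"

print(call_with_failover(fake_send))  # answered by anthropic
```

Because routes are ordered by priority, an outage at the primary is absorbed without the calling agent noticing anything beyond added latency.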

4. Real-Time Monitoring

A dashboard at localhost:8080/logs tracks requests, latency, tokens, costs, and errors in real time.

5. Semantic Caching

Cache similar requests to reduce costs and latency. Configurable similarity threshold.
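The idea of a configurable similarity threshold can be shown with a toy cache: a new prompt reuses a stored response when it is "close enough" to a cached prompt. Here a bag-of-words cosine stands in for real embeddings, and the class and threshold value are invented for illustration, not taken from Bifrost:

```python
# Toy semantic cache: reuse a cached response when a new prompt is
# similar enough to a stored one. Bag-of-words cosine stands in for
# real embeddings; the 0.8 threshold is an illustrative default.
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []  # (prompt, response) pairs

    def get(self, prompt: str):
        for cached_prompt, response in self.entries:
            if cosine(prompt, cached_prompt) >= self.threshold:
                return response  # similar enough: cache hit
        return None  # cache miss: caller must query the provider

    def put(self, prompt: str, response: str):
        self.entries.append((prompt, response))

cache = SemanticCache(threshold=0.8)
cache.put("explain python decorators", "A decorator wraps a function...")
print(cache.get("explain python decorators please"))  # hit: near-duplicate
print(cache.get("write a haiku"))                     # miss: None
```

Lowering the threshold trades accuracy for hit rate: more prompts match a cached entry, but the reused answer may fit the new prompt less well.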

6. Performance

| Metric | Value |
| --- | --- |
| Overhead | <100 microseconds |
| Throughput | 5,000 RPS |
| Caching | Semantic similarity |

Supported Agents

| Agent | Integration |
| --- | --- |
| Claude Code | MCP or API proxy |
| Codex CLI | API proxy |
| Gemini CLI | API proxy |
| Cursor | API proxy |
| Roo Code | API proxy |
| Qwen Code | API proxy |

FAQ

Q: How does it compare to LiteLLM? A: Bifrost claims to be 50x faster than LiteLLM, with sub-100-microsecond overhead. Enterprise features include budget management and governance.

Q: Can I use it for cost optimization? A: Yes, route simple tasks to cheap/fast providers (Groq, Cerebras) and complex tasks to premium models. Budget limits per project.
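The cost-optimization answer above amounts to a routing rule: classify the task, then pick a cheap or premium model. A crude sketch under assumed names (the length heuristic, cutoff, and model strings are all invented for illustration; real routing policies would be richer):

```python
# Sketch of cost-aware routing: simple tasks go to a fast/cheap
# provider, complex tasks to a premium model. Heuristic and names
# are illustrative, not Bifrost's governance rules.
CHEAP = "groq/llama-3.3-70b"
PREMIUM = "anthropic/claude-opus"

def pick_model(prompt: str, complexity_cutoff: int = 500) -> str:
    """Crude heuristic: long prompts are routed to the premium model."""
    return PREMIUM if len(prompt) > complexity_cutoff else CHEAP

print(pick_model("fix this typo"))  # groq/llama-3.3-70b
```

Per-project budget limits then act as a backstop when the heuristic sends too much traffic to the premium tier.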

Q: Is it open source? A: Yes, Apache 2.0. Maxim also offers a managed cloud version.


Source and acknowledgments

Created by Maxim. Licensed under Apache 2.0.

maximhq/bifrost — 3.6k+ stars
