MCP Configs · Apr 4, 2026 · 3 min read

Pal MCP Server — Multi-Model AI Gateway for Claude Code

MCP server that lets Claude Code use Gemini, OpenAI, Grok, and Ollama as a unified AI dev team. Features model routing, CLI-to-CLI bridge, and conversation continuity across 7+ providers.

TL;DR
MCP server that gives Claude Code access to Gemini, OpenAI, Grok, and Ollama through a single gateway.
§01

What it is

Pal MCP Server is a Model Context Protocol gateway that connects Claude Code to multiple LLM providers through a single configuration. It supports Gemini, OpenAI, Azure OpenAI, X.AI/Grok, OpenRouter, DIAL, and Ollama, letting you route prompts to the best model for each task without switching tools.

The server is designed for developers who want model diversity without leaving their existing Claude Code workflow. Instead of context-switching between CLIs, you send requests through Pal and let it handle provider routing, conversation continuity, and response formatting.

§02

How it saves time or tokens

Pal reduces token waste by routing tasks to the most cost-effective model: a simple code-formatting request can go to a smaller model, while complex architecture decisions stay with Claude. The CLI-to-CLI bridge (called 'clink') maintains conversation context across model switches, so changing providers mid-session does not mean re-prompting and the token overhead that comes with it.
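
As a concrete sketch of the routing knob, the env block below keeps DEFAULT_MODEL on auto so Pal can pick a model per request. GEMINI_API_KEY and DEFAULT_MODEL come from the documented setup in §03; OPENAI_API_KEY is an assumed conventional name, so verify it against Pal's documentation:

{
  "env": {
    "DEFAULT_MODEL": "auto",
    "GEMINI_API_KEY": "your-gemini-key",
    "OPENAI_API_KEY": "your-openai-key"
  }
}

Replacing "auto" with a single model name pins every request to that model, which disables cost-based routing entirely (see Common pitfalls below).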

§03

How to use

  1. Install prerequisites: Python 3.10+, Git, and uv (pip install uv).
  2. Add the Pal server to your .mcp.json configuration:
{
  "mcpServers": {
    "pal": {
      "command": "bash",
      "args": ["-c", "uvx --from git+https://github.com/BeehiveInnovations/pal-mcp-server.git pal-mcp-server"],
      "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "DEFAULT_MODEL": "auto"
      }
    }
  }
}
  3. Restart Claude Code. The tools chat, thinkdeep, planner, and consensus become available for multi-model interaction.
§04

Example

Send a planning request through Pal to get perspectives from multiple models:

# In Claude Code, use the planner tool
'Plan a microservices migration for a monolithic Node.js app'
# Pal routes sub-tasks to different models and merges results

The consensus tool queries multiple providers with the same prompt and synthesizes the points of agreement, which is useful for architecture decisions where you want diverse AI perspectives.
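
A consensus request follows the same pattern as the planner example above (the prompt wording here is illustrative):

# In Claude Code, use the consensus tool
'Should we split the order service out of the monolith now or after the migration?'
# Pal sends the same question to each configured provider and synthesizes the agreement points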

§05

Related on TokRepo

  • AI Gateway Providers — Compare gateway solutions like LiteLLM, OpenRouter, and Portkey alongside Pal
  • MCP Integrations — Browse other MCP server configurations for extending Claude Code
§06

Common pitfalls

  • Setting DEFAULT_MODEL to a specific model instead of 'auto' defeats the routing benefit. Keep it on 'auto' unless you have a specific reason to pin one model.
  • Each provider needs its own API key in the env block; missing keys cause silent failures on routes to that provider (a full multi-key sketch follows this list).
  • Conversation continuity via clink requires all participating models to be available. If one provider is down, the context chain breaks.
  • Always check the official documentation for the latest version-specific changes and migration guides before upgrading in production environments.
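
To make the API-key pitfall concrete, here is a sketch of an env block with one key per provider. Only GEMINI_API_KEY appears in the documented example above; the other variable names follow each vendor's common convention and should be checked against Pal's documentation before use:

{
  "env": {
    "DEFAULT_MODEL": "auto",
    "GEMINI_API_KEY": "your-gemini-key",
    "OPENAI_API_KEY": "your-openai-key",
    "XAI_API_KEY": "your-xai-key",
    "OPENROUTER_API_KEY": "your-openrouter-key"
  }
}

A provider with no key never receives routed requests, and per the pitfall above the failure can be silent rather than an explicit error.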

Frequently Asked Questions

How many AI providers does Pal MCP Server support?

Pal supports 7+ providers: Gemini, OpenAI, Azure OpenAI, X.AI/Grok, OpenRouter, DIAL, and Ollama for local models. You configure API keys for each provider you want to use in the .mcp.json env block.

Can Pal work with local models through Ollama?

Yes. Pal includes Ollama as a supported provider, so you can route requests to locally running models. This is useful for sensitive data that should not leave your machine or for reducing API costs on simple tasks.
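
A sketch of an Ollama-backed env block, assuming Pal accepts an OpenAI-compatible endpoint URL for local models (the CUSTOM_API_URL variable name is an assumption; check the project README for the exact setting):

{
  "env": {
    "CUSTOM_API_URL": "http://localhost:11434/v1",
    "DEFAULT_MODEL": "auto"
  }
}

Port 11434 is Ollama's default, and /v1 is its OpenAI-compatible API path, so nothing in this setup sends data off your machine.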

What is the clink feature in Pal MCP Server?

Clink is Pal's CLI-to-CLI bridge that maintains conversation continuity across different AI models. When Claude's context resets, the other models in the session can restore the full discussion history, preventing context loss during long sessions.

Does Pal require any changes to my existing Claude Code setup?

No. You only add a new entry to your .mcp.json file and restart Claude Code. Pal runs as a standard MCP server alongside any other servers you already have configured.
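
For example, adding Pal next to an existing server is just a sibling entry under mcpServers (the "filesystem" entry below is a placeholder for whatever you already run):

{
  "mcpServers": {
    "filesystem": {
      "command": "your-existing-command",
      "args": ["your-existing-args"]
    },
    "pal": {
      "command": "bash",
      "args": ["-c", "uvx --from git+https://github.com/BeehiveInnovations/pal-mcp-server.git pal-mcp-server"],
      "env": { "GEMINI_API_KEY": "your-gemini-key" }
    }
  }
}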

What tools does Pal expose to Claude Code?

Pal exposes four main tools: chat (send messages to any model), thinkdeep (extended reasoning with model selection), planner (multi-step project planning across models), and consensus (get agreement from multiple models on a question).


Source & Thanks

Created by BeehiveInnovations. Licensed under a custom license.

pal-mcp-server — ⭐ 11,300+

Thank you for building a powerful multi-model gateway for the AI developer community.
