MCP Configs · Apr 4, 2026 · 3 min read

Pal MCP Server — Multi-Model AI Gateway for Claude Code

MCP server that lets Claude Code use Gemini, OpenAI, Grok, and Ollama as a unified AI dev team. Features model routing, CLI-to-CLI bridge, and conversation continuity across 7+ providers.

TokRepo Featured · Community

Agent-ready

Safe staging for this asset

This asset is staged first. The copied prompt asks the agent to inspect the staged files before activating any scripts, MCP config, or global config.

Stage only · 17/100 · Policy: staging

Agent surface: Any MCP/CLI agent
Type: MCP Config
Install: Stage only
Trust: Established

Safe staging command

npx -y tokrepo@latest install 09c904b2-4bf7-4f1e-acf5-55cd465b6227 --target codex

Files are staged first; activation requires reviewing the README and the staged plan.

TL;DR

Pal MCP is an MCP server that adds Gemini, GPT-4, Grok, and Ollama to Claude Code as callable sub-agents. One config, seven providers.

§01

Why multi-model matters inside one agent

Claude Code is excellent at reasoning. Gemini 2.5 Pro has a 2M context window. GPT-4o is fast. Grok has live web access. Ollama runs offline. A real dev team uses all of them. Pal MCP collapses that into one tool call from Claude Code's perspective — ask it to "call Gemini on this 1.5M-token codebase" and Pal routes the request, returns the result, and maintains conversation continuity.

§02

Single-config setup

Add to .mcp.json:

{
  "mcpServers": {
    "pal": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/BeehiveInnovations/pal-mcp-server.git", "pal-mcp-server"],
      "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "OPENAI_API_KEY": "your-openai-key",
        "DEFAULT_MODEL": "auto"
      }
    }
  }
}

Restart Claude Code. Now pal_chat, pal_route, and pal_continue are callable.
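If the project already has a .mcp.json with other servers, hand-editing risks overwriting them. The following is a minimal sketch, not part of Pal itself, that merges the entry shown above into an existing file. The helper name add_pal_server is hypothetical, and the key values are the same placeholders used in the config.

```python
import json
from pathlib import Path

# Server entry copied from the config above; the key values are placeholders.
PAL_ENTRY = {
    "command": "uvx",
    "args": [
        "--from",
        "git+https://github.com/BeehiveInnovations/pal-mcp-server.git",
        "pal-mcp-server",
    ],
    "env": {
        "GEMINI_API_KEY": "your-gemini-key",
        "OPENAI_API_KEY": "your-openai-key",
        "DEFAULT_MODEL": "auto",
    },
}


def add_pal_server(config_path: Path) -> dict:
    """Merge the "pal" entry into config_path, keeping any other servers.

    Hypothetical helper for illustration; it creates the file if missing.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # setdefault keeps existing servers and only adds or replaces "pal".
    config.setdefault("mcpServers", {})["pal"] = PAL_ENTRY
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_pal_server(Path(".mcp.json")) from the project root would update the file in place, so keeping a backup of the original is a sensible precaution.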

§03

The routing logic

Set DEFAULT_MODEL=auto and Pal picks a model based on task heuristics:

Task signal	Routed model	Why
Context > 200K tokens	Gemini 2.5 Pro	2M context window
Needs live web facts	Grok	Twitter/X integration
Code completion loops	Ollama Codellama	Free, fast, local
Long reasoning chains	o3-preview	Best deliberation
Default	Claude Sonnet	Quality baseline

Override per-call with pal_chat(model="gpt-4o").
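To make the table concrete, here is a rough sketch of how such heuristics could be expressed as a first-match rule list. It is an illustration only, not Pal's actual routing code: the Task fields, the route function, and the top-to-bottom precedence are all assumptions, and the returned labels are simply the ones from the table.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task descriptor; Pal's real inputs may differ."""
    context_tokens: int = 0
    needs_live_web: bool = False
    is_code_completion: bool = False
    is_long_reasoning: bool = False


def route(task: Task) -> str:
    """Return a model label, checking the table's rows top to bottom.

    The precedence order is an assumption made for this sketch.
    """
    if task.context_tokens > 200_000:
        return "Gemini 2.5 Pro"
    if task.needs_live_web:
        return "Grok"
    if task.is_code_completion:
        return "Ollama Codellama"
    if task.is_long_reasoning:
        return "o3-preview"
    return "Claude Sonnet"  # default row of the table


# A large-context task matches the first rule even if other flags are set.
print(route(Task(context_tokens=1_500_000, is_long_reasoning=True)))
```

A first-match list keeps the behavior easy to predict: when two signals apply at once, the earlier row wins.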

§04

CLI-to-CLI bridge

Pal exposes a raw CLI bridge: call Aider, Continue, or any CLI-based agent from within Claude Code. Useful for chaining specialized agents in a single workflow.
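The general idea behind a CLI bridge can be sketched with Python's standard subprocess module: launch another command-line program, pass it a prompt, and capture what it prints. This is a generic illustration under stated assumptions, not Pal's bridge implementation. The run_cli_agent helper is hypothetical, and a Python one-liner stands in for a real agent CLI so that no particular tool's flags are assumed.

```python
import subprocess
import sys


def run_cli_agent(argv: list[str], prompt: str, timeout: float = 120.0) -> str:
    """Run a CLI program, send the prompt on stdin, and return its stdout.

    Hypothetical helper; a real agent CLI may expect arguments instead of stdin.
    """
    result = subprocess.run(
        argv,
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Stand-in "agent": a Python one-liner that upper-cases whatever it reads.
echo_agent = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
print(run_cli_agent(echo_agent, "review this diff").strip())
```

Passing the command as a list rather than a shell string avoids shell quoting issues when the prompt contains special characters.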

§05

Conversation continuity

Every Pal call can continue an existing thread:

pal_continue(thread_id="xyz", prompt="refactor based on Gemini's suggestions")

Thread state is persisted in SQLite at ~/.pal/threads.db, so threads survive restarts.
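For readers curious how SQLite-backed thread continuity can work in principle, the sketch below stores messages keyed by thread ID and reads them back in order. The table layout and the function names are hypothetical; the page does not document Pal's actual schema.

```python
import sqlite3


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) a thread store; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " thread_id TEXT NOT NULL,"
        " role TEXT NOT NULL,"
        " content TEXT NOT NULL)"
    )
    return conn


def append_message(conn: sqlite3.Connection, thread_id: str, role: str, content: str) -> None:
    """Append one message to a thread."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (thread_id, role, content) VALUES (?, ?, ?)",
            (thread_id, role, content),
        )


def load_thread(conn: sqlite3.Connection, thread_id: str) -> list[tuple[str, str]]:
    """Return (role, content) pairs for a thread in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE thread_id = ? ORDER BY id",
        (thread_id,),
    )
    return rows.fetchall()


# In-memory demo; a file path would let the history outlive the process.
store = open_store(":memory:")
append_message(store, "xyz", "user", "review this module")
append_message(store, "xyz", "assistant", "consider splitting the parser")
print(load_thread(store, "xyz"))
```

Continuing a thread then amounts to loading its earlier messages and sending them along with the new prompt.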

§06

Supported providers in 2026

Anthropic (Claude Opus, Sonnet, Haiku)
OpenAI (GPT-4o, o3, o3-mini)
Google (Gemini 2.5 Pro, Flash)
xAI (Grok-3)
DeepSeek (R1, V3)
Ollama (local, 50+ models)
LiteLLM (proxy for 100+ more)

§07

Cost control

Pal emits a cost summary per session: total tokens, a per-model breakdown, and a dollar estimate. Set the MAX_COST_PER_SESSION environment variable (for example, MAX_COST_PER_SESSION=5) to hard-stop runaway loops.
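The arithmetic behind a session budget is simple enough to sketch. The tracker below accumulates tokens per model, prices them, and raises once the total passes the limit. It only illustrates the idea: the CostTracker class, the BudgetExceeded exception, the model names, and the per-1K-token prices are all made up, and Pal's own accounting may differ.

```python
import os
from collections import defaultdict


class BudgetExceeded(RuntimeError):
    """Raised when the session's estimated cost passes the limit."""


class CostTracker:
    """Hypothetical per-session tracker; prices are supplied by the caller."""

    def __init__(self, max_cost_usd: float, usd_per_1k_tokens: dict[str, float]):
        self.max_cost_usd = max_cost_usd
        self.usd_per_1k_tokens = usd_per_1k_tokens
        self.tokens: dict[str, int] = defaultdict(int)

    def cost_of(self, model: str) -> float:
        return self.tokens[model] / 1000 * self.usd_per_1k_tokens[model]

    def total_cost(self) -> float:
        return sum(self.cost_of(model) for model in self.tokens)

    def record(self, model: str, tokens: int) -> None:
        """Add usage for one call and hard-stop if the budget is exceeded."""
        if model not in self.usd_per_1k_tokens:
            raise KeyError(f"no price configured for {model!r}")
        self.tokens[model] += tokens
        if self.total_cost() > self.max_cost_usd:
            raise BudgetExceeded(
                f"${self.total_cost():.2f} exceeds limit ${self.max_cost_usd:.2f}"
            )

    def summary(self) -> dict:
        """Total tokens, per-model breakdown, and a dollar estimate."""
        return {
            "total_tokens": sum(self.tokens.values()),
            "per_model": {
                m: {"tokens": t, "usd": round(self.cost_of(m), 4)}
                for m, t in self.tokens.items()
            },
            "total_usd": round(self.total_cost(), 4),
        }


# Made-up prices; the limit falls back to 5 when the variable is unset.
limit = float(os.environ.get("MAX_COST_PER_SESSION", "5"))
tracker = CostTracker(limit, {"model-a": 0.01, "model-b": 0.002})
tracker.record("model-a", 12_000)
tracker.record("model-b", 50_000)
print(tracker.summary())
```

Checking the budget after every recorded call is what turns a reporting feature into a guardrail: a loop that keeps calling models stops at the first call that crosses the limit.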

§08

When Pal is not the right choice

Single-model workflows — overhead not worth it, use the provider SDK directly.
Production agents — MCP is still evolving; use LiteLLM Proxy for production-grade routing.
Compliance-regulated environments — each upstream provider has different data policies; Pal doesn't unify compliance.

Frequently asked questions

How is Pal different from LiteLLM?

LiteLLM is a Python proxy library designed for production backends. Pal is an MCP server designed for interactive use inside agents like Claude Code. Pal adds thread continuity and CLI bridging that LiteLLM does not provide, but LiteLLM has stronger production-grade features like retries and load balancing.

Does Pal support local models?

Yes. Ollama is a first-class provider. Point Pal at your local Ollama instance with OLLAMA_BASE_URL and it will route appropriate tasks to your local models. Useful for offline work or privacy-sensitive data.

Can I use Pal outside Claude Code?

Yes. Any MCP-compatible client works: Cursor, Codex CLI, Zed, Cline, and others. Because MCP is a standardized protocol, Pal exposes the same tools to each of them.

Is there a cost guardrail?

Yes. Set the MAX_COST_PER_SESSION environment variable to hard-stop sessions that exceed the limit. Pal also emits a per-call cost summary so you can track spending in real time.

Which provider does Pal default to?

With DEFAULT_MODEL=auto, Pal picks based on task heuristics — Gemini for huge context, Grok for live web facts, Ollama for local code completion, o3 for long reasoning, Claude Sonnet as the quality baseline.

Source and acknowledgments

Created by BeehiveInnovations. Licensed under a custom license.

pal-mcp-server — ⭐ 11,300+

Thank you for building a powerful multi-model gateway for the AI developer community.
