Practical Notes
- Quant: start with one sample MCP server image and measure end-to-end rollout time.
- Quant: define SLOs (p95 latency, error rate) for gateway-routed calls and compare them against direct stdio usage.
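One way to make the SLO comparison concrete is to compute p95 latency and error rate from recorded calls on each path and check them against a target. The sketch below is a minimal illustration, not part of the gateway itself: `Sample`, `summarize`, `meets_slo`, and the thresholds are hypothetical names and values, and the percentile uses the nearest-rank method.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One recorded tool call (hypothetical record shape)."""
    latency_ms: float
    ok: bool


def p95(latencies):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not latencies:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies)
    # Integer form of ceil(0.95 * n), which avoids float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(samples):
    """Return p95 latency and error rate for one path (gateway or direct stdio)."""
    errors = sum(1 for s in samples if not s.ok)
    return {
        "p95_ms": p95([s.latency_ms for s in samples]),
        "error_rate": errors / len(samples),
    }


def meets_slo(summary, max_p95_ms, max_error_rate):
    """True when both the latency and the error-rate targets are met."""
    return (
        summary["p95_ms"] <= max_p95_ms
        and summary["error_rate"] <= max_error_rate
    )
```

Running `summarize` once over gateway samples and once over direct stdio samples gives two summaries that can be compared side by side before any SLO is committed to.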
Pattern: separate tool hosting from tool use
A gateway becomes useful when you have multiple MCP servers and multiple clients.
Operational guidance:
- Treat MCP servers as deployable artifacts (images) with versioning.
- Use the gateway as the control plane for discovery and routing.
- Keep auth/network policies at the gateway boundary.
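The guidance above can be pictured as a small registry that only accepts servers with a pinned image reference and answers discovery and routing lookups. This is a toy in-memory sketch under stated assumptions: `ServerRecord`, `Registry`, the image names, and the endpoints are all hypothetical and do not reflect the gateway's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerRecord:
    """A deployable MCP server (hypothetical shape)."""
    name: str
    image: str      # image reference, expected to carry an explicit tag or digest
    endpoint: str   # where the gateway forwards requests


class Registry:
    """In-memory stand-in for the gateway's discovery and routing role."""

    def __init__(self):
        self._servers = {}

    def register(self, record):
        # Look for a tag or digest only in the last path segment, so a
        # registry port such as "localhost:5000/..." is not mistaken for a tag.
        last_segment = record.image.rsplit("/", 1)[-1]
        if ":" not in last_segment or record.image.endswith(":latest"):
            raise ValueError(f"image must be pinned to a version: {record.image}")
        self._servers[record.name] = record

    def discover(self):
        """List registered server names in a stable order."""
        return sorted(self._servers)

    def route(self, name):
        """Return the endpoint for a registered server."""
        if name not in self._servers:
            raise LookupError(f"unknown MCP server: {name}")
        return self._servers[name].endpoint
```

Rejecting unpinned images at registration time is one design choice among several; the same check could instead live in a CI step or an admission policy.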
Migration tip
Start by routing a single non-critical tool through the gateway, then expand coverage once observability and failure handling are solid.
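A minimal sketch of that staged migration, assuming the client can reach a tool either through the gateway or over its existing direct path: only tools on an allowlist go through the gateway, and a gateway failure falls back to the direct path while being counted. `call_tool`, `via_gateway`, `via_direct`, and the fallback behavior itself are illustrative assumptions, not something the README prescribes.

```python
def call_tool(name, args, via_gateway, via_direct, gateway_tools, failures):
    """Route allowlisted tools through the gateway, everything else directly.

    via_gateway and via_direct are caller-supplied callables (hypothetical)
    taking (name, args). failures is a dict used as a simple failure counter.
    """
    if name not in gateway_tools:
        return via_direct(name, args)
    try:
        return via_gateway(name, args)
    except Exception:
        # Count the failure so it shows up in metrics, then fall back.
        failures[name] = failures.get(name, 0) + 1
        return via_direct(name, args)
```

Expanding coverage then amounts to adding names to `gateway_tools` once the failure counter for the tools already migrated stays flat.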
FAQ
Q: Is this only for Kubernetes? A: The README focuses on Docker + Kubernetes deployments, plus an Azure deployment path.
Q: What should I gateway first? A: A low-risk tool with clear success criteria and measurable latency.
Q: How do I keep versions sane? A: Version images, pin gateway releases, and roll out new versions as canaries, promoting them only when metrics stay within target.
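For the canary part of that answer, one common approach is to assign each client to a stable bucket and send a fixed percentage of buckets to the new version. The sketch below assumes a string client identifier and uses hash-based bucketing; `bucket`, `pick_version`, and the version labels are hypothetical and independent of how any particular gateway implements rollouts.

```python
import hashlib


def bucket(client_id):
    """Map a client id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def pick_version(client_id, stable, canary, canary_percent):
    """Send roughly canary_percent of clients to the canary version."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return canary if bucket(client_id) < canary_percent else stable
```

Because the bucket depends only on the client id, a given client keeps seeing the same version while the percentage is held constant, which keeps per-version metrics comparable.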