Skills · May 11, 2026 · 1 min read

Seldon Core — ML Model Serving on Kubernetes

An MLOps framework for deploying, monitoring, and managing machine learning models at scale on Kubernetes.

Universal CLI install command
npx tokrepo install f0a44fcc-4cd0-11f1-9bc6-00163e2b0d79

Introduction

Seldon Core is an open-source platform for deploying machine learning models on Kubernetes. It converts trained models into production REST/gRPC microservices with built-in monitoring, A/B testing, and canary deployments, bridging the gap between data science and production operations.

What Seldon Core Does

  • Deploys ML models as Kubernetes-native microservices via custom resources
  • Supports pre-built servers for scikit-learn, XGBoost, TensorFlow, PyTorch, and Triton
  • Provides inference graphs for multi-model pipelines with routers, combiners, and transformers
  • Enables A/B testing and canary rollouts with traffic splitting
  • Integrates with Prometheus and Grafana for request metrics and model monitoring
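
As a rough sketch of the traffic-splitting idea in the list above, the Python snippet below assembles a SeldonDeployment with two predictors whose `traffic` weights sum to 100. The deployment name, predictor names, and `gs://example-bucket/...` model URIs are hypothetical placeholders, and `build_canary_deployment` is an illustrative helper, not part of any Seldon library.

```python
import json


def build_canary_deployment(name, main_uri, canary_uri, canary_percent):
    """Return a SeldonDeployment dict that splits traffic between two predictors."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")

    def predictor(pred_name, model_uri, traffic):
        return {
            "name": pred_name,
            "replicas": 1,
            "traffic": traffic,  # percentage of requests routed to this predictor
            "graph": {
                "name": "classifier",
                "implementation": "SKLEARN_SERVER",
                "modelUri": model_uri,
            },
        }

    return {
        "apiVersion": "machinelearning.seldon.io/v1",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "predictors": [
                predictor("main", main_uri, 100 - canary_percent),
                predictor("canary", canary_uri, canary_percent),
            ]
        },
    }


# Hypothetical names and bucket paths, used only for illustration.
manifest = build_canary_deployment(
    "iris-canary",
    "gs://example-bucket/iris/v1",
    "gs://example-bucket/iris/v2",
    canary_percent=25,
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed document could be saved to a file and applied with `kubectl apply -f`. Raising `canary_percent` step by step and re-applying is one way to approximate a gradual rollout.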

Architecture Overview

Seldon Core runs as a Kubernetes operator that watches SeldonDeployment custom resources. When a deployment is created, the operator generates the required pods and services, along with the ingress routing for Istio or Ambassador. Each inference server wraps the model in a standardized REST/gRPC interface. An orchestrator sidecar routes each request through the steps of a multi-step inference graph, handling request transformation and response aggregation.
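
To make the routing behaviour described above more concrete, here is a toy Python model of how a request might flow through an inference graph. It is not Seldon's orchestrator code: the node type names mirror those used in SeldonDeployment graphs, but `run_graph`, the `fn` field, and the example graph are hypothetical and exist only for illustration.

```python
from statistics import mean


def run_graph(node, payload):
    """Recursively evaluate a toy inference graph node on a payload."""
    kind = node["type"]
    children = node.get("children", [])

    if kind in ("TRANSFORMER", "MODEL"):
        # Apply this node's function, then pass the result down the chain.
        result = node["fn"](payload)
        for child in children:
            result = run_graph(child, result)
        return result

    if kind == "ROUTER":
        # A router picks exactly one child to handle the request.
        index = node["fn"](payload)
        return run_graph(children[index], payload)

    if kind == "COMBINER":
        # A combiner fans the request out and aggregates the child responses.
        outputs = [run_graph(child, payload) for child in children]
        return node["fn"](outputs)

    raise ValueError(f"unknown node type: {kind}")


# Hypothetical graph: scale the inputs, then average the outputs of two models.
graph = {
    "type": "TRANSFORMER",
    "fn": lambda xs: [x / 10 for x in xs],
    "children": [{
        "type": "COMBINER",
        "fn": mean,
        "children": [
            {"type": "MODEL", "fn": lambda xs: sum(xs)},
            {"type": "MODEL", "fn": lambda xs: max(xs)},
        ],
    }],
}

print(run_graph(graph, [10, 20, 30]))  # prints 4.5
```

In a real deployment each node would be a separate container reached over REST or gRPC; the recursion here only mimics the order in which such calls would be made.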

Self-Hosting & Configuration

  • Install via Helm into any Kubernetes cluster (1.18+)
  • Requires an ingress controller (Istio or Ambassador) for external access
  • Define models as SeldonDeployment YAML manifests with modelUri pointing to S3, GCS, or PVC
  • Configure autoscaling with Kubernetes HPA or KEDA
  • Enable request logging by routing prediction payloads to Elasticsearch
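
The manifest step above can be sketched in Python as follows. The snippet builds a minimal single-predictor SeldonDeployment and rejects a `modelUri` whose scheme is not one of the storage types listed. The `s3`, `gs`, and `pvc` prefixes, the namespace, and the bucket path are assumptions made for this example, and `sklearn_deployment` is a hypothetical helper.

```python
import json
from urllib.parse import urlparse

# Storage types named in the list above; the exact URI prefixes are an assumption.
SUPPORTED_SCHEMES = {"s3", "gs", "pvc"}


def sklearn_deployment(name, namespace, model_uri, replicas=1):
    """Build a single-predictor SeldonDeployment for a scikit-learn model."""
    scheme = urlparse(model_uri).scheme
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported modelUri scheme: {scheme!r}")
    return {
        "apiVersion": "machinelearning.seldon.io/v1",
        "kind": "SeldonDeployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictors": [{
                "name": "default",
                "replicas": replicas,
                "graph": {
                    "name": "classifier",
                    "implementation": "SKLEARN_SERVER",  # pre-built scikit-learn server
                    "modelUri": model_uri,
                },
            }]
        },
    }


# Hypothetical namespace and bucket path.
deployment = sklearn_deployment("iris", "models", "s3://example-bucket/sklearn/iris")
print(json.dumps(deployment, indent=2))
```

Checking the URI scheme before applying the manifest is a small design choice that surfaces typos locally instead of as a failed model download inside the cluster.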

Key Features

  • Language-agnostic model wrapping with pre-built and custom inference servers
  • Inference graph DSL for chaining models, transformers, and routers
  • Drift and outlier detection via Alibi Detect integration
  • Explainability endpoints using Alibi Explain for model transparency
  • V2 inference protocol compatible with KServe and Triton standards
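
For the V2 inference protocol mentioned above, a request body carries one or more named tensors, each with a shape, a datatype, and flattened data. The sketch below builds such a body for a 2-D batch; the tensor name `input-0` and the sample values are made up, and `v2_infer_request` is a hypothetical helper. Under the V2 protocol the body would typically be POSTed to `/v2/models/<model_name>/infer`, with the host and any path prefix depending on the ingress setup.

```python
import json
from math import prod


def v2_infer_request(name, rows, datatype="FP32"):
    """Build a V2 inference protocol request body for a 2-D batch of numbers."""
    shape = [len(rows), len(rows[0])]
    if any(len(row) != shape[1] for row in rows):
        raise ValueError("all rows must have the same length")
    # Tensor data is sent flattened in row-major order.
    flat = [value for row in rows for value in row]
    if len(flat) != prod(shape):
        raise ValueError("data length does not match shape")
    return {
        "inputs": [
            {"name": name, "shape": shape, "datatype": datatype, "data": flat}
        ]
    }


# Hypothetical tensor name and feature values.
body = v2_infer_request("input-0", [[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]])
print(json.dumps(body))
```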

Comparison with Similar Tools

  • KServe: lighter-weight serverless inference; Seldon Core offers richer inference graph composition
  • BentoML: packaging-focused, with BentoCloud for hosting; Seldon Core is Kubernetes-native from the start
  • Triton Inference Server: NVIDIA's runtime engine; Seldon Core orchestrates Triton as one of its backends
  • Ray Serve: Python-first and tied to the Ray ecosystem; Seldon Core uses a Kubernetes-native deployment model

FAQ

Q: Which model frameworks does Seldon Core support?
A: scikit-learn, XGBoost, TensorFlow, PyTorch, ONNX, Triton, and custom Python/Java/Go servers.

Q: Can I run Seldon Core without Istio?
A: Yes. Ambassador or nginx ingress controllers work as alternatives to Istio.

Q: How does Seldon Core handle model versioning?
A: Deploy multiple model versions as separate predictors with traffic-split percentages for gradual rollout.

Q: Is there a managed cloud offering?
A: Yes. Seldon Deploy provides an enterprise platform with a management UI, audit trails, and enhanced monitoring.
