Scripts · April 1, 2026 · 1 min read

BentoML — Build AI Model Serving APIs

BentoML builds model inference REST APIs and multi-model serving systems from Python scripts. 8.6K+ GitHub stars. Auto Docker, dynamic batching, any ML framework. Apache 2.0.

TokRepo Picks · Community
Quick Start

Try it first, then decide whether to dig deeper.

This section should let both users and agents know what to copy first, what to install, and where it lands.

# Install
pip install -U bentoml

# Create a service (service.py)
cat > service.py << 'EOF'
import bentoml

@bentoml.service
class Summarizer:
    def __init__(self):
        from transformers import pipeline
        self.pipeline = pipeline("summarization")

    @bentoml.api
    def summarize(self, text: str) -> str:
        result = self.pipeline(text, max_length=100)
        return result[0]["summary_text"]
EOF

# Serve locally
bentoml serve service:Summarizer

# Build a Bento, then containerize it as a Docker image
bentoml build && bentoml containerize summarizer:latest
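
`bentoml build` reads its build configuration from a bentofile.yaml next to service.py. A minimal sketch (the package list here is illustrative; match it to what your service actually imports):

```yaml
service: "service:Summarizer"   # module:class of the service to package
include:
  - "service.py"                # files to bundle into the Bento
python:
  packages:
    - transformers              # illustrative: deps your service imports
    - torch
```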

Introduction

BentoML is a Python framework for building online serving systems optimized for AI apps and model inference. With 8,600+ GitHub stars and an Apache 2.0 license, it turns model inference scripts into production REST APIs using Python type hints, automatically generates Docker containers with dependency management, optimizes performance through dynamic batching and model parallelism, and supports any ML framework and inference runtime. Deploy to Docker or BentoCloud for production.
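
The type-hints-to-schema idea can be illustrated with a small sketch. This is a hypothetical helper, not BentoML's internals: it shows how a framework can derive an endpoint's request/response schema purely from a method's Python type annotations.

```python
import inspect
from typing import get_type_hints

# Illustrative mapping from Python types to JSON schema type names
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def endpoint_schema(fn):
    """Derive a JSON-ish schema for an endpoint from its type hints (sketch)."""
    hints = get_type_hints(fn)
    params = {
        name: PY_TO_JSON.get(hints.get(name), "object")
        for name in inspect.signature(fn).parameters
        if name != "self"  # skip the instance parameter
    }
    return {"input": params, "output": PY_TO_JSON.get(hints.get("return"), "object")}

def summarize(self, text: str, max_length: int = 100) -> str:
    ...

print(endpoint_schema(summarize))
# {'input': {'text': 'string', 'max_length': 'integer'}, 'output': 'string'}
```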

Best for: ML engineers deploying models as production APIs with minimal boilerplate
Works with: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
Frameworks: PyTorch, TensorFlow, HuggingFace, ONNX, XGBoost, any runtime


Key Features

  • Python-first: Type hints auto-generate REST API schema
  • Auto Docker: One command to containerize with all dependencies
  • Dynamic batching: Automatically batch requests for throughput
  • Model parallelism: Multi-GPU and multi-model serving
  • Any framework: PyTorch, TensorFlow, HuggingFace, ONNX, XGBoost
  • BentoCloud: Managed deployment with auto-scaling
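
The dynamic-batching idea above can be sketched in plain Python. This is an illustrative toy, not BentoML's actual batcher: requests queue up individually and are flushed as one batch when either the batch is full or the oldest request has waited past a latency budget.

```python
import time
from collections import deque

class DynamicBatcher:
    """Toy dynamic batcher: group single requests into batches for throughput."""

    def __init__(self, max_batch_size=4, max_latency_s=0.01):
        self.max_batch_size = max_batch_size
        self.max_latency_s = max_latency_s
        self.queue = deque()  # (arrival_time, item)

    def submit(self, item):
        """Enqueue one request with its arrival timestamp."""
        self.queue.append((time.monotonic(), item))

    def ready_batch(self):
        """Return one batch if a flush condition is met, else None."""
        if not self.queue:
            return None
        full = len(self.queue) >= self.max_batch_size
        stale = time.monotonic() - self.queue[0][0] >= self.max_latency_s
        if full or stale:
            n = min(self.max_batch_size, len(self.queue))
            return [self.queue.popleft()[1] for _ in range(n)]
        return None

batcher = DynamicBatcher(max_batch_size=3)
for i in range(5):
    batcher.submit(i)
print(batcher.ready_batch())  # first full batch: [0, 1, 2]
```

In a real server the flush loop runs concurrently with request handlers and the batch is fed to the model in one forward pass; the size/latency trade-off is the same as sketched here.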

FAQ

Q: What is BentoML? A: BentoML is a Python framework with 8.6K+ stars for turning ML models into production REST APIs. Auto Docker, dynamic batching, any framework. Apache 2.0.

Q: How do I install BentoML? A: pip install -U bentoml. Decorate your class with @bentoml.service, methods with @bentoml.api, then bentoml serve.


🙏

Source & Credits

Created by BentoML. Licensed under Apache 2.0. bentoml/BentoML — 8,600+ GitHub stars

Related Assets