Key Features
- Python-first: Type hints auto-generate REST API schema
- Auto Docker: One command to containerize with all dependencies
- Dynamic batching: Automatically batch requests for throughput
- Model parallelism: Multi-GPU and multi-model serving
- Any framework: PyTorch, TensorFlow, HuggingFace, ONNX, XGBoost
- BentoCloud: Managed deployment with auto-scaling
FAQ
Q: What is BentoML?
A: BentoML is an Apache 2.0-licensed Python framework (8.6K+ GitHub stars) for turning ML models into production REST APIs, with automatic Docker containerization, dynamic batching, and support for any ML framework.
Q: How do I install BentoML?
A: Run `pip install -U bentoml`. Decorate your service class with `@bentoml.service` and its endpoint methods with `@bentoml.api`, then start the server with `bentoml serve`.