BentoML

1.4.38 · active · verified Thu Apr 16

BentoML is an open-source framework for building, shipping, and scaling AI applications. It allows developers to create production-ready API endpoints from machine learning models, bundle them into 'Bentos' (deployable archives), and serve them via a unified API server. It currently supports a wide range of ML frameworks and provides tools for model management, API orchestration, and deployment to various platforms. BentoML is actively maintained with frequent patch releases and regular minor version updates.

Common errors

Warnings

Install
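BentoML installs from PyPI; the quickstart below also uses `pydantic` for input validation:

```shell
pip install bentoml pydantic
```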

Imports

Quickstart

This quickstart demonstrates how to save a plain Python function as a 'model' with `bentoml.picklable_model.save_model`, wrap it in a runner, and expose it through a `bentoml.Service` API endpoint that uses `bentoml.io.JSON` for input/output handling (the legacy 1.1-style service API, still shipped in 1.4). It then shows how to serve the service locally and call it with `curl`.

import bentoml
from bentoml.io import JSON
from pydantic import BaseModel

# Define a Pydantic model for input data validation
class InputData(BaseModel):
    name: str
    age: int

# Define a simple prediction function to be served.
# It receives a plain dict, which is what the runner passes through.
def greeter_predict(input_data: dict) -> dict:
    return {"greeting": f"Hello, {input_data['name']}! You are {input_data['age']} years old."}

# Save the function as a 'model' in the local model store.
# picklable_model stores arbitrary Python objects; in a real scenario you'd
# save a trained model via a framework module such as bentoml.sklearn or
# bentoml.pytorch instead.
saved_model = bentoml.picklable_model.save_model(
    "greeter_model",
    greeter_predict,  # the callable itself is the model for this demo
    signatures={"__call__": {"batchable": False}},
)

# Create a runner from the saved model, then a BentoML Service that owns it.
# Runners are how a Service schedules and executes model inference.
greeter_runner = saved_model.to_runner()
svc = bentoml.Service("greeter_service", runners=[greeter_runner])

# Define an API endpoint backed by the runner
# The `input` and `output` descriptors specify the wire format for the API
@svc.api(input=JSON(pydantic_model=InputData), output=JSON())
async def greet(input_data: InputData) -> dict:
    # async_run dispatches to the saved function without blocking the event loop
    return await greeter_runner.async_run(input_data.dict())

# To run this service locally:
# 1. Save the code above as `service.py`
# 2. Run in terminal: `bentoml serve service.py:svc --reload`
# 3. Access at http://localhost:3000/greet with a POST request, e.g.:
#    curl -X POST -H "Content-Type: application/json" \
#    -d '{"name": "Alice", "age": 30}' \
#    http://localhost:3000/greet
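Because the endpoint declares `pydantic_model=InputData`, malformed JSON is rejected before `greet` ever runs. A minimal standalone sketch of that validation behavior (no running server needed):

```python
from pydantic import BaseModel, ValidationError

# Same schema as the quickstart's InputData
class InputData(BaseModel):
    name: str
    age: int

# Valid payloads are parsed into a typed object
ok = InputData(name="Alice", age=30)
assert ok.age == 30

# Malformed payloads raise ValidationError, which the JSON descriptor
# surfaces to the client as an HTTP 400 before the handler is called
try:
    InputData(name="Bob", age="not a number")
except ValidationError as e:
    print("rejected:", len(e.errors()), "error(s)")
```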
