{"id":7978,"library":"bentoml","title":"BentoML","description":"BentoML is an open-source framework for building, shipping, and scaling AI applications. It allows developers to create production-ready API endpoints from machine learning models, bundle them into 'Bentos' (deployable archives), and serve them via a unified API server. It currently supports a wide range of ML frameworks and provides tools for model management, API orchestration, and deployment to various platforms. BentoML is actively maintained with frequent patch releases and regular minor version updates.","status":"active","version":"1.4.38","language":"python","source_language":"en","source_url":"https://github.com/bentoml/bentoml","tags":["MLOps","model serving","deployment","API","AI"],"install":[{"cmd":"pip install bentoml","lang":"bash","label":"Install BentoML"},{"cmd":"pip install \"bentoml[transformers]\" # For Transformers models\npip install \"bentoml[pytorch]\" # For PyTorch models","lang":"bash","label":"Install with optional ML framework support"}],"dependencies":[{"reason":"Common ML framework used with BentoML for model serving. Installable via `pip install \"bentoml[sklearn]\"`.","package":"scikit-learn","optional":true},{"reason":"Popular deep learning framework. Installable via `pip install \"bentoml[pytorch]\"`.","package":"torch","optional":true},{"reason":"Popular deep learning framework. Installable via `pip install \"bentoml[tensorflow]\"`.","package":"tensorflow","optional":true}],"imports":[{"symbol":"Service","correct":"from bentoml import Service"},{"symbol":"bentoml.io","correct":"from bentoml.io import JSON, NumpyNdarray, Image # Or other I/O types"},{"note":"The `bentoml.pickler` module and the direct `save_model` function were part of the 0.x API. 
In 1.x, use the framework-specific `save_model` functions (e.g. `bentoml.picklable_model.save_model`) for model management.","wrong":"bentoml.pickler.save_model(...)","symbol":"picklable_model.save_model","correct":"import bentoml\nbentoml.picklable_model.save_model(...)"},{"note":"The class for defining an ML service was renamed from `BentoService` to `Service` in the 1.0 release.","wrong":"from bentoml import BentoService","symbol":"bentoml.BentoService","correct":"from bentoml import Service"},{"note":"The `bentoml.artifacts` and `bentoml.adapters` modules were removed in the 1.0 release. Model loading is now handled via `bentoml.models.get` and I/O handling via `bentoml.io`.","wrong":"from bentoml.adapters import DataframeInput","symbol":"bentoml.artifacts","correct":"import bentoml\n# Access models via bentoml.models.get or attach them to a Service as Runners"}],"quickstart":{"code":"import bentoml\nfrom bentoml.io import JSON\nfrom pydantic import BaseModel\n\n# Define a Pydantic model for input data validation\nclass InputData(BaseModel):\n    name: str\n    age: int\n\n# Define a simple prediction function to be served\ndef greeter_predict(input_data: InputData) -> dict:\n    return {\"greeting\": f\"Hello, {input_data.name}! You are {input_data.age} years old.\"}\n\n# Save a dummy 'greeter' model (in real apps, this would be a trained ML model)\n# `picklable_model` stores any picklable Python object; for a real scikit-learn,\n# PyTorch, etc. model, use the matching framework module's save_model instead\nsaved_model = bentoml.picklable_model.save_model(\n    \"greeter_model\",\n    greeter_predict,  # saving the function directly for this simple example\n    signatures={\"__call__\": {\"batchable\": False}},\n)\n\n# Wrap the saved model in a Runner and attach it to a BentoML Service\ngreeter_runner = bentoml.picklable_model.get(\"greeter_model:latest\").to_runner()\nsvc = bentoml.Service(name=\"greeter_service\", runners=[greeter_runner])\n\n# Define an API endpoint; `input` and `output` declare the data types for the API\n@svc.api(input=JSON(pydantic_model=InputData), output=JSON())\nasync def greet(input_data: InputData) -> dict:\n    # Invoke the saved function through its runner\n    return await greeter_runner.async_run(input_data)\n\n# To run this service locally:\n# 1. Save the code above as `service.py`\n# 2. Run in terminal: `bentoml serve service.py:svc --reload`\n# 3. Access at http://localhost:3000/greet with a POST request, e.g.:\n#    curl -X POST -H \"Content-Type: application/json\" \\\n#    -d '{\"name\": \"Alice\", \"age\": 30}' \\\n#    http://localhost:3000/greet\n","lang":"python","description":"This quickstart demonstrates how to save a plain Python function as a 'model' with `bentoml.picklable_model.save_model`, wrap it in a Runner, and expose it from a `bentoml.Service` API endpoint that uses `bentoml.io` for input/output handling. It shows how to serve the service locally and interact with it via `curl`."},"warnings":[{"fix":"Refer to the official BentoML 1.0 migration guide. 
Rewrite service definitions, model saving/loading, and API input/output definitions according to the new `Service`, `bentoml.models`, and `bentoml.io` patterns.","message":"Major API overhaul in BentoML 1.0. The entire API was redesigned, making 0.x code incompatible with 1.x. Key changes include `BentoService` renamed to `Service`, removal of `bentoml.artifacts` and `bentoml.adapters`, and a new `bentoml.io` module for I/O handling.","severity":"breaking","affected_versions":"0.x to 1.x"},{"fix":"Ensure the framework-specific `save_model()` call (e.g. `bentoml.picklable_model.save_model()`) completed successfully and returned a model tag. Verify that the `bentoml.Service` constructor references the model, either through a Runner in its `runners` argument or a tag in its `models` argument. For `bentoml build`, ensure your `bentofile.yaml` correctly lists all required models.","message":"Model not found errors during `bentoml serve` or `bentoml build` indicate that the service cannot locate the specified model. This often happens if the model was not saved correctly or if the service's `runners`/`models` list doesn't correctly reference it.","severity":"gotcha","affected_versions":"1.0+"},{"fix":"Configure runner resources in the BentoML configuration file (e.g. `runners.resources` in `bentoml_configuration.yaml`) or pass server options to `bentoml serve` (e.g. `--api-workers 4`). For more complex scenarios, consider custom Runners.","message":"Incorrect resource allocation (CPU, GPU workers) can lead to underutilization or over-provisioning. 
BentoML defaults to managing workers based on available resources, but for optimal performance, explicit configuration is often needed.","severity":"gotcha","affected_versions":"1.0+"},{"fix":"Always explicitly declare all project dependencies (including ML frameworks like `torch`, `tensorflow`, `scikit-learn`) in the `python.packages` section of `bentofile.yaml`, or in a `requirements.txt` file referenced via `python.requirements_txt`.","message":"When building a Bento, external Python dependencies not explicitly listed in `bentofile.yaml` or `requirements.txt` will be missing in the deployed environment, leading to `ModuleNotFoundError`.","severity":"gotcha","affected_versions":"1.0+"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Run `pip install bentoml` to install the library.","cause":"BentoML library is not installed in the active Python environment.","error":"ModuleNotFoundError: No module named 'bentoml'"},{"fix":"Ensure that the framework-specific `save_model()` call was executed successfully for 'your_model_name' and that the `bentoml.Service` constructor correctly references the model (e.g., `runners=[bentoml.models.get('your_model_name').to_runner()]`). Verify the model name and version are correct.","cause":"The BentoML service or build command cannot find the specified model in the local model store.","error":"bentoml.exceptions.NotFound: Model 'your_model_name:latest' not found"},{"fix":"Convert the Pydantic model to a plain dictionary before returning it (e.g., `return my_pydantic_model.model_dump()` on Pydantic v2, or `return my_pydantic_model.dict()` on Pydantic v1).","cause":"You are trying to return a Pydantic model directly from an API endpoint declared with `output=JSON()` without proper serialization.","error":"TypeError: Object of type <PydanticModel> is not JSON serializable"},{"fix":"Ensure you are using `await` for async calls within async functions. 
If calling sync code from async, or vice-versa, ensure correct thread management or consider using `anyio.to_thread.run_sync` for blocking operations within async APIs, or `asyncio.run` for running top-level async functions.","cause":"An asynchronous API endpoint (defined with `async def`) or an async operation is being called from a synchronous context or without a proper async event loop managed by AnyIO/asyncio.","error":"RuntimeError: No event loop is running in current thread."},{"fix":"This issue is often mitigated in newer BentoML versions with increased SQLite busy timeout and WAL mode. Ensure you are on a recent version (>=1.4.36). If it persists, ensure only one process accesses the database at a time or consider using a persistent model store backend for production.","cause":"Concurrent access attempts to the SQLite database used by BentoML for metadata storage, often occurring in high-concurrency scenarios or when `bentoml serve` is killed uncleanly.","error":"OperationalError: database is locked"}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null,"pypi_latest":null,"cli_name":"bentoml"}