Azure Machine Learning Inference Server (HTTP)
The `azureml-inference-server-http` library provides the core HTTP server runtime for deploying machine learning models on Azure Machine Learning. It hosts a user-defined entry script (`score.py`) by loading its `init()` and `run()` functions, exposing the model through a Flask-based REST API. As of version 1.5.1, the library evolves primarily through internal improvements and stability updates, with a focus on seamless integration into the Azure ML ecosystem.
Common errors
- `Failed to find 'init' function in entry script.`
  - cause: The `score.py` file does not contain a globally defined `init()` function, which the inference server requires.
  - fix: Ensure your `score.py` defines `def init():` at the module level. This function is called once when the server starts.
- `Failed to find 'run' function in entry script.`
  - cause: The `score.py` file does not contain a globally defined `run(raw_data)` function, which is required for processing inference requests.
  - fix: Ensure your `score.py` defines `def run(raw_data):` at the module level. This function is called for each incoming inference request.
- `ModuleNotFoundError: No module named 'your_custom_package'`
  - cause: A Python package imported by your `score.py` (or its dependencies) is not present in the environment where the inference server is running.
  - fix: For local testing, `pip install your_custom_package` in your active environment. For Azure ML deployments, list `your_custom_package` in the `conda_env.yml` (or `requirements.txt`) specified during deployment.
- `json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)`
  - cause: The `run(raw_data)` function received input that is not valid JSON, or `raw_data` was not correctly parsed as a string.
  - fix: Ensure the client sends a valid JSON string in the request body (e.g., `Content-Type: application/json`), and add robust error handling around `json.loads(raw_data)` inside your `run` function.
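The JSON error above can be guarded against inside `run()` itself. A minimal sketch of defensive parsing (the `run` body here is illustrative, not part of the library's API):

```python
import json

def run(raw_data):
    # Parse defensively: raw_data arrives as a string from the HTTP request body.
    try:
        data = json.loads(raw_data)
    except (json.JSONDecodeError, TypeError) as e:
        # Return a structured error instead of letting the server surface a 500.
        return json.dumps({"error": f"invalid JSON input: {e}"})
    return json.dumps({"echo": data})

# Behaves well for both valid and invalid payloads:
print(run('{"input": 1}'))   # {"echo": {"input": 1}}
print(run("not json"))       # {"error": "invalid JSON input: ..."}
```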
Warnings
- gotcha: This library is primarily a server runtime, not a client library for direct programmatic interaction with ML models. Users define `init()` and `run()` functions in a `score.py` file that this server loads and executes.
- gotcha: Local testing requires pointing the server at your entry script, either via the `--entry_script` flag of the `azmlinfsrv` command or the `AZUREML_ENTRY_SCRIPT` environment variable.
- breaking: The signature requirements for `init()` and `run()` in `score.py` have been stable, but future versions might introduce subtle changes or new optional parameters.
- gotcha: All Python package dependencies for your model code within `score.py` must be explicitly managed and installed into the server's environment. `pip install azureml-inference-server-http` does not install your model's dependencies.
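For Azure ML deployments, dependencies are typically declared in a conda environment file like the `conda_env.yml` mentioned above. A hedged sketch (package names and versions are placeholders; pin them to match your training environment):

```yaml
# conda_env.yml -- illustrative only
name: inference-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - pip
  - pip:
      - azureml-inference-server-http
      - your_custom_package   # whatever score.py imports
```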
Install
-
pip install azureml-inference-server-http
Imports
- The server is launched from the command line (`azmlinfsrv`) rather than imported in user code; the installed Python package is `azureml_inference_server_http`.
Users define `init()` and `run()` functions in a `score.py` file that the server loads.
Quickstart
# Create a file named 'score.py'
# ---
# import json
#
# def init():
#     global model
#     # In a real scenario, load your model here, e.g., from a file.
#     model = {"status": "initialized"}
#
# def run(raw_data):
#     try:
#         data = json.loads(raw_data)
#         prediction = f"Model received input: {data.get('input', 'no input')} and is {model['status']}"
#         return json.dumps({"output": prediction})
#     except Exception as e:
#         return json.dumps({"error": str(e)})
# ---
# To run locally, save the above to 'score.py' and execute in your terminal:
#   azmlinfsrv --entry_script score.py
# (alternatively, set AZUREML_ENTRY_SCRIPT=score.py before launching the server)
#
# Then, send a request to http://localhost:5001/score
# Example with curl:
#   curl -X POST -H "Content-Type: application/json" -d '{"input": "example data"}' http://localhost:5001/score
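Because `init()` and `run()` are ordinary module-level functions, the entry-script contract can also be exercised in plain Python before launching the server. A self-contained sketch (it writes a minimal, hypothetical `score.py` to a temp directory, then loads and calls it the way the server would):

```python
import importlib.util
import json
import os
import tempfile

# A minimal score.py, written out so this harness is self-contained.
SCORE_PY = '''
import json

def init():
    global model
    model = {"status": "initialized"}

def run(raw_data):
    data = json.loads(raw_data)
    return json.dumps({"output": f"got {data.get('input')} and model is {model['status']}"})
'''

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "score.py")
    with open(path, "w") as f:
        f.write(SCORE_PY)

    # Load it as a plain module whose init()/run() we call directly.
    spec = importlib.util.spec_from_file_location("score", path)
    score = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(score)

    score.init()  # one-time setup, mirrors server startup
    response = score.run(json.dumps({"input": "example data"}))
    print(response)
```

This is handy for unit-testing scoring logic in CI without standing up the HTTP server.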