Azure Machine Learning Inference Server (HTTP)

1.5.1 · active · verified Thu Apr 16

The `azureml-inference-server-http` library provides the core HTTP server runtime for deploying machine learning models on Azure Machine Learning. It hosts a user-defined scoring script (`score.py`) by loading its `init()` and `run()` functions and exposing them through a Flask-based REST API. As of version 1.5.1, it continues to evolve primarily through internal improvements and stability updates, with a focus on seamless integration into the Azure ML ecosystem.

Common errors

Warnings

Install
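The package installs from PyPI under the name given above; installing into a virtual environment is a reasonable default, though not required:

```shell
# Install the inference server runtime from PyPI
python -m pip install azureml-inference-server-http
```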

Imports

Quickstart

This quickstart demonstrates how to create a minimal `score.py` file, which is the standard interface for models running on `azureml-inference-server-http`. It also provides the necessary shell commands to run this server locally for testing. The server exposes a `/score` endpoint for inference requests.

# Create a file named 'score.py'
# ---
# import json
# 
# def init():
#     global model
#     # In a real scenario, load your model here, e.g., from a file.
#     model = {"status": "initialized"}
# 
# def run(raw_data):
#     try:
#         data = json.loads(raw_data)
#         prediction = f"Model received input: {data.get('input', 'no input')} and is {model['status']}"
#         return json.dumps({"output": prediction})
#     except Exception as e:
#         return json.dumps({"error": str(e)})
# ---
# To run locally, save the above to 'score.py' and start the server from your
# terminal using the `azmlinfsrv` CLI installed with the package:
# azmlinfsrv --entry_script score.py
# 
# Then, send a request to http://localhost:5001/score
# Example with curl:
# curl -X POST -H "Content-Type: application/json" -d '{"input": "example data"}' http://localhost:5001/score
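The same request can be sent from Python. The sketch below is a minimal client using only the standard library; it assumes the quickstart server is running locally on the default port 5001 (the `score` helper name is ours, not part of the library):

```python
import json
import urllib.request

def score(payload, url="http://localhost:5001/score"):
    """POST a JSON payload to the local inference server and return the parsed reply."""
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example (requires a running server):
# score({"input": "example data"})
```

This mirrors the `curl` invocation above: a POST with a `Content-Type: application/json` header and a JSON body.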