Azure AI Inference Client Library
The Microsoft Azure AI Inference client library for Python provides a unified interface to Azure AI model inference capabilities such as chat completions and text embeddings, through clients like `ChatCompletionsClient` and `EmbeddingsClient`. It is currently in preview (version `1.0.0b9`) and follows the standard Azure SDK release cadence, so expect frequent updates.
Warnings
- breaking This library is currently in a beta state (`1.0.0b9`). Breaking changes are expected in future releases until it reaches a stable (1.0.0) version. Public API surface and underlying models may change without prior notice.
- gotcha Authentication is required, either with an API key (`AzureKeyCredential` from `azure-core`) or with Microsoft Entra ID via `DefaultAzureCredential`. The latter tries a chain of methods in order (environment variables, managed identity, Azure CLI, and others); if none is configured, requests fail with an authentication error at call time.
- gotcha The service endpoint must be passed explicitly to the client constructor (e.g., `ChatCompletionsClient(endpoint=..., credential=...)`); the library does not read it from the environment for you. The samples below keep it in an `AZURE_AI_INFERENCE_ENDPOINT` environment variable. Omitting it fails at client construction; a wrong value fails on the first request.
Install
-
pip install azure-ai-inference azure-identity
Imports
- ChatCompletionsClient
from azure.ai.inference import ChatCompletionsClient
- SystemMessage, UserMessage
from azure.ai.inference.models import SystemMessage, UserMessage
- DefaultAzureCredential
from azure.identity import DefaultAzureCredential
Quickstart
import os
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.identity import DefaultAzureCredential

# Set your Azure AI Inference service endpoint as an environment variable:
# export AZURE_AI_INFERENCE_ENDPOINT="https://<your-service-name>.inference.ai.azure.com/"
endpoint = os.environ.get("AZURE_AI_INFERENCE_ENDPOINT", "")
if not endpoint:
    raise ValueError("Please set the AZURE_AI_INFERENCE_ENDPOINT environment variable.")

# Authenticate using DefaultAzureCredential (checks environment variables,
# managed identity, Azure CLI, etc.)
credential = DefaultAzureCredential()

try:
    # Initialize the client; Entra ID auth needs the Cognitive Services token scope
    client = ChatCompletionsClient(
        endpoint=endpoint,
        credential=credential,
        credential_scopes=["https://cognitiveservices.azure.com/.default"],
    )

    # Example: chat completion
    print("Sending chat completion request...")
    response = client.complete(
        messages=[
            SystemMessage(content="You are a helpful AI assistant."),
            UserMessage(content="What is the capital of France?"),
        ],
        model="gpt-35-turbo",  # Replace with your deployed model name
        max_tokens=128,
    )
    for choice in response.choices:
        print(f"Assistant: {choice.message.content}")
except Exception as e:
    print(f"An error occurred: {e}")
    print("Ensure AZURE_AI_INFERENCE_ENDPOINT is set and you are authenticated to Azure (e.g., via `az login`).")