Azure AI Inference Client Library

1.0.0b9 · active · verified Thu Apr 09

The Azure AI Inference client library for Python provides a unified interface to Azure AI inference capabilities such as chat completions and text embeddings. The current release, 1.0.0b9, is a preview (beta) version, and the package follows the standard Azure SDK release cadence with frequent updates.

Warnings

Install

Imports

Quickstart

This quickstart initializes a `ChatCompletionsClient`, authenticates with `DefaultAzureCredential`, and sends a basic chat completion request. Before running it, set the `AZURE_AI_INFERENCE_ENDPOINT` environment variable, and either log in with the Azure CLI (`az login`) or configure the service principal environment variables.

import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.identity import DefaultAzureCredential

# Set your Azure AI Inference service endpoint as an environment variable:
# export AZURE_AI_INFERENCE_ENDPOINT="https://<your-service-name>.inference.ai.azure.com/"
endpoint = os.environ.get("AZURE_AI_INFERENCE_ENDPOINT", "")

if not endpoint:
    raise ValueError("Please set the AZURE_AI_INFERENCE_ENDPOINT environment variable.")

# Authenticate using DefaultAzureCredential (checks environment variables, Azure CLI, etc.)
credential = DefaultAzureCredential()

# Replace with your deployed model name
model = "gpt-35-turbo"

try:
    # Initialize the chat completions client.
    # Some endpoint types also need an explicit credential_scopes=[...] argument
    # for token authentication; check the documentation for your endpoint.
    client = ChatCompletionsClient(endpoint=endpoint, credential=credential)

    # Example: Chat Completion
    messages = [
        SystemMessage(content="You are a helpful AI assistant."),
        UserMessage(content="What is the capital of France?"),
    ]

    print(f"Sending chat completion request to model: {model}...")
    response = client.complete(messages=messages, model=model, max_tokens=128)

    # Print the content of every returned choice
    for choice in response.choices:
        print(f"Assistant: {choice.message.content}")

except Exception as e:
    print(f"An error occurred: {e}")
    print("Ensure AZURE_AI_INFERENCE_ENDPOINT is set and you are authenticated to Azure (e.g., via `az login`).")
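
Most quickstart failures come from the environment rather than the request itself, so it can help to check the configuration before creating a client. The following is a minimal sketch using only the standard library. The helper name `check_inference_env` is hypothetical and not part of the SDK. `AZURE_AI_INFERENCE_ENDPOINT` comes from the quickstart, and the three `AZURE_*` names are the variables that `azure-identity` reads for a service principal with a client secret.

```python
from urllib.parse import urlparse

ENDPOINT_VAR = "AZURE_AI_INFERENCE_ENDPOINT"
# Read by azure-identity's EnvironmentCredential (service principal with secret).
SERVICE_PRINCIPAL_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def check_inference_env(env):
    """Return a list of human-readable problems found in `env` (a mapping).

    Hypothetical helper for illustration; an empty list means no problem was found.
    """
    problems = []

    endpoint = env.get(ENDPOINT_VAR, "").strip()
    if not endpoint:
        problems.append(f"{ENDPOINT_VAR} is not set")
    elif "<" in endpoint or ">" in endpoint:
        # The documentation placeholder was copied without being filled in.
        problems.append(f"{ENDPOINT_VAR} still contains a <placeholder>")
    else:
        parsed = urlparse(endpoint)
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append(f"{ENDPOINT_VAR} must be an https:// URL, got {endpoint!r}")

    # A partially configured service principal is a common source of login errors.
    present = [name for name in SERVICE_PRINCIPAL_VARS if env.get(name)]
    if present and len(present) < len(SERVICE_PRINCIPAL_VARS):
        missing = [name for name in SERVICE_PRINCIPAL_VARS if name not in present]
        problems.append("incomplete service principal settings, missing: " + ", ".join(missing))

    return problems
```

Calling `check_inference_env(os.environ)` before the quickstart and printing each returned string gives a quick pre-flight report. The check does not contact Azure, so an empty result only means the variables look well formed, not that authentication will succeed.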
