Azure AI Vision Image Analysis Client Library
`azure-ai-vision-imageanalysis` is Microsoft's Python client library for the Azure AI Vision Image Analysis service. It processes images and extracts visual features such as captions, text (OCR), and detected objects. The current version is 1.0.0, and releases follow the usual Azure SDK cadence.
Warnings
- breaking The SDK was extensively rewritten in version 1.0.0-beta.1 to align with other Azure SDKs, and all APIs changed significantly. Code written against older SDKs (e.g., `azure-cognitiveservices-vision-computervision`) or earlier preview versions must be updated to the new API surface.
- gotcha Authentication credentials (endpoint and key) must be passed explicitly to the `ImageAnalysisClient` constructor. The library does not read `VISION_ENDPOINT` and `VISION_KEY` from environment variables on its own; samples read them into variables and pass those in.
- gotcha Certain visual features, like 'Caption' or 'Dense Captions', require your Computer Vision resource to be deployed in a GPU-supported Azure region. If your resource is in an unsupported region, requests for these features will fail with a `400 Bad Request` error.
- gotcha Image analysis has specific input constraints: supported formats (JPEG, PNG, GIF, BMP, WEBP, ICO, TIFF, MPO), file size (less than 20 MB), and dimensions (between 50x50 and 16000x16000 pixels). Images outside these limits are rejected with a `400 Bad Request` error.
- gotcha For Optical Character Recognition (OCR) on document-heavy content like PDFs, Office documents, or HTML, use the Azure Document Intelligence service and its specialized Read model instead of the Image Analysis service. The Image Analysis OCR (`VisualFeatures.READ`) is optimized for general images.
- deprecated Older Computer Vision API versions (1.0, 2.0, 3.0, 3.1, and 3.2 preview) are scheduled for retirement. The Image Analysis 4.0 Preview APIs (e.g., `2023-04-01-preview`) will be retired on March 31, 2025. This library (`azure-ai-vision-imageanalysis`) targets the GA 4.0 API.
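Two of the gotchas above (missing credentials and out-of-range images) can be caught locally before any request is sent. The following is a minimal sketch, not part of the library: `load_vision_config` and `check_image_limits` are hypothetical helper names, the byte count behind "20 MB" is assumed to be 20 * 1024 * 1024, and the pixel bounds are assumed to be inclusive.

```python
import os
from typing import List, Mapping, Optional, Tuple

# Limits quoted in the warnings above. The exact byte count behind "20 MB"
# and the inclusiveness of the pixel bounds are assumptions of this sketch.
SUPPORTED_FORMATS = {"JPEG", "PNG", "GIF", "BMP", "WEBP", "ICO", "TIFF", "MPO"}
MAX_FILE_BYTES = 20 * 1024 * 1024
MIN_SIDE_PX = 50
MAX_SIDE_PX = 16000


def load_vision_config(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (endpoint, key), or raise if either variable is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in ("VISION_ENDPOINT", "VISION_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["VISION_ENDPOINT"], env["VISION_KEY"]


def check_image_limits(fmt: str, size_bytes: int, width: int, height: int) -> List[str]:
    """Return a list of problems; an empty list means no documented limit was hit."""
    problems = []
    if fmt.upper() not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    if size_bytes >= MAX_FILE_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    if min(width, height) < MIN_SIDE_PX:
        problems.append(f"image too small: {width}x{height}")
    if max(width, height) > MAX_SIDE_PX:
        problems.append(f"image too large: {width}x{height}")
    return problems
```

The endpoint and key returned by `load_vision_config()` would then be passed to `ImageAnalysisClient` explicitly. How the format, width, and height of an image are obtained is left to the caller.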
Install
- pip install azure-ai-vision-imageanalysis
- pip install azure-identity
- pip install aiohttp
Imports
- ImageAnalysisClient
from azure.ai.vision.imageanalysis import ImageAnalysisClient
- VisualFeatures
from azure.ai.vision.imageanalysis.models import VisualFeatures
- AzureKeyCredential
wrong: from azure.ai.vision.imageanalysis import AzureKeyCredential
correct: from azure.core.credentials import AzureKeyCredential
- DefaultAzureCredential
wrong: from azure.ai.vision.imageanalysis import DefaultAzureCredential
correct: from azure.identity import DefaultAzureCredential
Quickstart
import os
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential
# Set environment variables or replace with actual values
# For authentication with AzureKeyCredential
vision_endpoint = os.environ.get('VISION_ENDPOINT', 'YOUR_VISION_ENDPOINT')
vision_key = os.environ.get('VISION_KEY', 'YOUR_VISION_KEY')
# For authentication with DefaultAzureCredential (uncomment and configure if needed)
# from azure.identity import DefaultAzureCredential
# credential = DefaultAzureCredential()
# Authenticate the client
# Using AzureKeyCredential (most common for quickstarts)
credential = AzureKeyCredential(vision_key)
client = ImageAnalysisClient(endpoint=vision_endpoint, credential=credential)
# Image to analyze
image_url = "https://learn.microsoft.com/azure/ai-services/computer-vision/media/quickstarts/presentation.png"
print("Analyzing image from URL...")
# Analyze the image for a caption and tags
result = client.analyze_from_url(
    image_url,
    visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS]
)
# Each feature is None on the result unless it was requested
if result.caption is not None:
    print(f" Caption: '{result.caption.text}' (confidence: {result.caption.confidence:.2f})")
if result.tags is not None:
    print(" Tags:")
    for tag in result.tags.list:
        print(f"   '{tag.name}' (confidence: {tag.confidence:.2f})")
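The quickstart requests a caption and tags; OCR works the same way when `VisualFeatures.READ` is requested. The sketch below shows one way to flatten the OCR output into a list of text lines. It assumes the result exposes `read.blocks`, each block exposes `lines`, and each line exposes `text`, as in the SDK's published samples. `extract_text_lines` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import List


def extract_text_lines(result) -> List[str]:
    """Collect the text of every OCR line, or return [] if READ was not requested."""
    if getattr(result, "read", None) is None:
        return []
    return [line.text for block in result.read.blocks for line in block.lines]


# Stand-in with the assumed shape of a real analysis result (illustrative only)
fake_result = SimpleNamespace(
    read=SimpleNamespace(
        blocks=[
            SimpleNamespace(
                lines=[SimpleNamespace(text="Hello"), SimpleNamespace(text="World")]
            )
        ]
    )
)
print(extract_text_lines(fake_result))  # prints ['Hello', 'World']

# With the client from the quickstart, the call would look like:
# result = client.analyze_from_url(image_url, visual_features=[VisualFeatures.READ])
# for text in extract_text_lines(result):
#     print(text)
```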