Azure AI Content Understanding

1.1.0 verified Sat May 09 auth: no python

Azure AI Content Understanding Python SDK (v1.1.0) from Microsoft. Provides a client for analyzing content with Azure AI services, with synchronous and asynchronous analysis of documents, images, and text. Released on a monthly cadence.

pip install azure-ai-contentunderstanding
error AttributeError: module 'azure.ai.contentunderstanding' has no attribute 'ContentUnderstandingClient'
cause ContentUnderstandingClient is exported from the top-level package, so this error usually means the package is missing or not installed correctly rather than that the import path is wrong.
fix
Ensure you've installed azure-ai-contentunderstanding and import with: from azure.ai.contentunderstanding import ContentUnderstandingClient
error azure.core.exceptions.HttpResponseError: (InvalidRequest) 'content_type' is required for binary content
cause When passing binary data (e.g., BytesIO), AnalyzeOptions must specify content_type.
fix
Add content_type argument: AnalyzeOptions(content_type='application/pdf')
error azure.identity._exceptions.CredentialUnavailableError: DefaultAzureCredential failed to retrieve a token
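cause DefaultAzureCredential tries several credential sources in turn (environment variables, managed identity, an Azure CLI login, and others), and none of them produced a token.
fix
Set AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET, or sign in with az login and use AzureCliCredential()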
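When the AttributeError entry above appears, it can help to confirm which version of the distribution, if any, is installed. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of the SDK.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("azure-ai-contentunderstanding")
    if version is None:
        print("not installed: run pip install azure-ai-contentunderstanding")
    else:
        print(f"installed version: {version}")
```

A None result points to a missing install; a version older than the one this entry documents would suggest upgrading before debugging further.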
breaking In v1.1.0, the parameter 'content_type' in AnalyzeOptions is required for binary input; omitting it causes a 400 error.
fix Always pass content_type when providing binary data, for example content_type='application/pdf' for a PDF, or the appropriate MIME type for other files.
deprecated The method 'begin_analyze' was replaced by 'analyze' in v1.0.0; 'begin_analyze' no longer exists.
fix Use client.analyze() instead of client.begin_analyze().
gotcha DefaultAzureCredential raises CredentialUnavailableError when none of its credential sources is available. Setting AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET configures the environment-based source.
fix Set those environment variables or use a different credential class such as AzureCliCredential.
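
Before constructing a credential, a quick check of the three environment variables named above can make a misconfiguration obvious. This is a sketch under that assumption only; missing_credential_vars and REQUIRED_VARS are hypothetical names, and the check does not cover other credential sources such as an Azure CLI login.

```python
import os
from typing import List, Mapping

# Variables named in the gotcha above for environment-based authentication.
REQUIRED_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_credential_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credential_vars()
    if missing:
        print("unset variables: " + ", ".join(missing))
    else:
        print("all environment credential variables are set")
```

An empty list means the environment-based source is configured; a non-empty list names exactly what to set, or signals that another credential class may be the better choice.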

Initializes the client and analyzes a PDF document.

import os

from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeOptions
# DefaultAzureCredential ships in the separate azure-identity package
from azure.identity import DefaultAzureCredential

endpoint = os.environ.get('AZURE_CONTENT_UNDERSTANDING_ENDPOINT', '')
credential = DefaultAzureCredential()

client = ContentUnderstandingClient(endpoint, credential)

# Analyze a document
with open('sample.pdf', 'rb') as f:
    result = client.analyze(f, AnalyzeOptions(content_type='application/pdf'))
print(result)
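
The quickstart hard-codes application/pdf. For other file types, the MIME type could be inferred from the file extension with the standard library, as in the sketch below. The helper name guess_content_type is hypothetical, and whether the service accepts every inferred type is not confirmed here.

```python
import mimetypes


def guess_content_type(path: str) -> str:
    """Return the MIME type inferred from the file extension of path."""
    content_type, _encoding = mimetypes.guess_type(path)
    if content_type is None:
        # Refuse to guess so the caller supplies an explicit MIME type.
        raise ValueError(f"cannot infer a MIME type for {path!r}; pass one explicitly")
    return content_type


if __name__ == "__main__":
    print(guess_content_type("sample.pdf"))
```

The returned string could then be supplied as AnalyzeOptions(content_type=guess_content_type('sample.pdf')) in place of the literal value.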