Azure AI Translation Document Client Library for Python

1.1.0 · active · verified Wed Apr 15

The Azure AI Translation Document client library for Python is part of Microsoft's Azure SDK and integrates Document Translation into applications. It translates whole documents across multiple languages and dialects while preserving their original structure and formatting. The library supports asynchronous batch translation of multiple or complex files stored in Azure Blob Storage, as well as synchronous single-file translation. The current stable version is 1.1.0.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates an asynchronous batch document translation using the `DocumentTranslationClient`. It translates every document in a source Azure Blob Storage container into the desired language and writes the results to a target container. The Translator resource needs access to your storage account: either enable a system-assigned managed identity on the resource and grant it the 'Storage Blob Data Contributor' role on the storage account, or append SAS tokens to the container URLs (read and list permissions on the source, write and list on the target).

import os
from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import DocumentTranslationClient, DocumentTranslationInput, TranslationTarget

# Read the endpoint, key, and container URLs from environment variables.
# Endpoint form: https://YOUR_TRANSLATOR_RESOURCE_NAME.cognitiveservices.azure.com/
# Container URL form: https://YOUR_STORAGE_ACCOUNT.blob.core.windows.net/source?sas_token
endpoint = os.environ.get("AZURE_DOCUMENT_TRANSLATION_ENDPOINT")
key = os.environ.get("AZURE_DOCUMENT_TRANSLATION_KEY")
source_container_url = os.environ.get("AZURE_SOURCE_CONTAINER_URL")
target_container_url = os.environ.get("AZURE_TARGET_CONTAINER_URL")
target_language = "es"

# Stop early if any required variable is missing or empty
if not all([endpoint, key, source_container_url, target_container_url]):
    raise SystemExit("Please set the environment variables: AZURE_DOCUMENT_TRANSLATION_ENDPOINT, AZURE_DOCUMENT_TRANSLATION_KEY, AZURE_SOURCE_CONTAINER_URL, AZURE_TARGET_CONTAINER_URL")

def begin_batch_translation():
    client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))

    inputs = [
        DocumentTranslationInput(
            source_url=source_container_url,
            targets=[
                TranslationTarget(
                    target_url=target_container_url,
                    language_code=target_language
                )
            ]
        )
    ]

    print("Submitting batch translation job...")
    poller = client.begin_translation(inputs)
    
    print(f"Job ID: {poller.id}")
    # status() is a method on the poller, so it must be called
    print(f"Job status: {poller.status()}")

    # Wait for the job to complete
    result = poller.result()

    print("Translation job completed. Document statuses:")
    for document_status in result:
        print(f"Document ID: {document_status.id}")
        print(f"  Source document URL: {document_status.source_document_url}")
        print(f"  Translated document URL: {document_status.translated_document_url}")
        print(f"  Status: {document_status.status}")
        if document_status.error:
            print(f"  Error: {document_status.error.code} - {document_status.error.message}")

if __name__ == '__main__':
    begin_batch_translation()
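
The quickstart mentions SAS tokens "with appropriate permissions" without showing how to check them. The following is a minimal sketch of a pre-flight check, not part of the library: `missing_sas_permissions` and `REQUIRED_PERMISSIONS` are hypothetical names, the URLs and signatures are made up, and the required permission sets (read and list for the source, write and list for the target) are an assumption to confirm against the Document Translation documentation. It only inspects the `sp` query parameter of the container URL and does not validate the token with Azure.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed minimum permissions per container role (r=read, w=write, l=list).
REQUIRED_PERMISSIONS = {"source": set("rl"), "target": set("wl")}


def missing_sas_permissions(container_url: str, role: str) -> set:
    """Return the permission letters that `role` needs but the URL's SAS token lacks."""
    query = parse_qs(urlsplit(container_url).query)
    # `sp` holds the signed permissions; treat a missing value as no permissions.
    granted = set(query.get("sp", [""])[0])
    return REQUIRED_PERMISSIONS[role] - granted


# Made-up URLs for illustration only; the signatures are not real.
source = "https://example.blob.core.windows.net/source?sp=rl&sv=2022-11-02&sr=c&sig=abc"
target = "https://example.blob.core.windows.net/target?sp=r&sv=2022-11-02&sr=c&sig=abc"

print(sorted(missing_sas_permissions(source, "source")))  # nothing missing
print(sorted(missing_sas_permissions(target, "target")))  # write and list are missing
```

Running a check like this before `begin_translation` can surface a mis-scoped token locally, before the batch job is submitted.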
