Amazon Textract Caller

0.2.4 · abandoned · verified Mon Apr 13

This library provides a simplified Python interface for calling the Amazon Textract API. As of its latest PyPI release (0.2.4), it focuses on sending raw API requests and returning the raw responses. Active development has largely shifted to the `amazon-textract-textractor` library, which offers more comprehensive document parsing and utility features. The `amazon-textract-caller` package itself has not seen updates since January 2021.

Warnings

Install

Imports

Quickstart

This quickstart uses `amazon-textract-caller` to invoke the Amazon Textract API on a document stored in S3. The call returns the raw JSON response from Textract as a Python dictionary. Make sure your AWS credentials and region are configured beforehand (e.g., via environment variables or the AWS CLI).

from textractcaller.t_call import call_textract, Textract_Features

# AWS credentials and region are read from the standard boto3 sources,
# e.g. the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION
# environment variables, or the AWS CLI configuration.

# Replace with your actual S3 document URI (e.g., "s3://your-bucket/your-document.pdf")
s3_document_uri = "s3://YOUR_BUCKET/YOUR_DOCUMENT.pdf"

try:
    # Call the Textract API with the requested features.
    # The package returns the raw JSON response from Textract as a dict.
    response = call_textract(
        input_document=s3_document_uri,
        features=[Textract_Features.FORMS, Textract_Features.TABLES],
    )
    print("Textract API call successful. Raw JSON response received:")
    # print(response)  # Uncomment to see the full raw Textract JSON response
    print(f"Detected {len(response.get('Blocks', []))} blocks.")

    print("\nNote: For higher-level parsing and object models, consider the `amazon-textract-textractor` library.")

except Exception as e:
    # Broad catch for demonstration only; narrow this in production code.
    print(f"Error during Textract API call: {e}")
    print("Ensure valid AWS credentials, correct S3 URI, and appropriate permissions.")
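
Because the package returns the raw response, any further interpretation is left to the caller. The following is a minimal sketch of reading that dictionary, assuming the usual Textract response shape in which `Blocks` is a list of dicts carrying a `BlockType` and, for text blocks, a `Text` field. The helper names `count_block_types` and `extract_lines` are hypothetical and not part of this package, and `sample_response` is a hand-written stand-in so the snippet runs without AWS access.

```python
from __future__ import annotations

from collections import Counter
from typing import Any


def count_block_types(response: dict[str, Any]) -> Counter:
    """Count blocks by BlockType (hypothetical helper, not part of the package)."""
    return Counter(
        block.get("BlockType", "UNKNOWN") for block in response.get("Blocks", [])
    )


def extract_lines(response: dict[str, Any]) -> list[str]:
    """Return the text of every LINE block, in response order (hypothetical helper)."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


# Hand-written sample that mimics the assumed response shape; not real Textract output.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "p1"},
        {"BlockType": "LINE", "Id": "l1", "Text": "Invoice 1001"},
        {"BlockType": "WORD", "Id": "w1", "Text": "Invoice"},
        {"BlockType": "WORD", "Id": "w2", "Text": "1001"},
    ]
}

print(extract_lines(sample_response))
print(count_block_types(sample_response))
```

In practice you would pass the `response` from the quickstart call instead of `sample_response`. For key-value pairs and tables, which require following block relationships, a dedicated parser such as `amazon-textract-textractor` is likely the better fit.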
