PyObjC Vision Framework

12.1 · active · verified Sun Apr 12

pyobjc-framework-vision provides Python wrappers for Apple's Vision framework on macOS. It is part of the PyObjC project, a bidirectional bridge that lets Python code call Objective-C libraries, including the macOS Cocoa frameworks. The current version is 12.1; the project is actively released, and releases typically track macOS SDK updates and new Python versions.

Warnings

Install
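
The package is published on PyPI as pyobjc-framework-Vision and is only usable on macOS. The following is a minimal sketch for confirming an install, assuming the usual command `pip install pyobjc-framework-Vision` was run in the active environment. The helper name `vision_available` is hypothetical and not part of PyObjC.

```python
import importlib.util
import sys


def vision_available():
    """Return True when running on macOS and the Vision bindings are importable."""
    if sys.platform != "darwin":
        # The Vision framework only exists on macOS.
        return False
    # find_spec returns None when no top-level module named "Vision" is installed.
    return importlib.util.find_spec("Vision") is not None


if vision_available():
    print("Vision bindings found.")
else:
    print("Vision bindings not found; on macOS, try: pip install pyobjc-framework-Vision")
```

The check only looks the module up and does not import it, so it is safe to run on any platform.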

Imports

Quickstart

This quickstart uses `pyobjc-framework-vision` to recognize text in an image. It follows the typical Vision pattern: load an image, create a request handler, define a request, and process the results. The image path is read from the `VISION_TEST_IMAGE_PATH` environment variable and falls back to the placeholder `dummy_image_with_text.png`; point it at a real image containing text to get output.

import os
import Vision
from Foundation import NSURL, NSDictionary
from Quartz import CIImage  # CIImage is a Core Image class; PyObjC exposes Core Image through the Quartz package

def recognize_text_in_image(image_path):
    """Return the recognized text lines in the image, or an empty list on failure."""
    if not os.path.exists(image_path):
        print(f"Error: Image not found at {image_path}")
        return []

    # Convert the Python path to an NSURL
    input_url = NSURL.fileURLWithPath_(image_path)

    # Create a CIImage from the URL
    input_image = CIImage.imageWithContentsOfURL_(input_url)
    if input_image is None:
        print(f"Error: Could not create CIImage from {image_path}")
        return []

    # Create a Vision request handler
    vision_options = NSDictionary.dictionaryWithDictionary_({})
    vision_handler = Vision.VNImageRequestHandler.alloc().initWithCIImage_options_(
        input_image, vision_options
    )

    # Prepare a text recognition request
    results = []

    def completion_handler(request, error):
        if error is not None:
            print(f"Vision request error: {error}")
            return
        # results() may be None when nothing was recognized
        for observation in request.results() or []:
            if isinstance(observation, Vision.VNRecognizedTextObservation):
                for text_candidate in observation.topCandidates_(1):
                    results.append(text_candidate.string())

    text_request = Vision.VNRecognizeTextRequest.alloc().initWithCompletionHandler_(
        completion_handler
    )
    
    # Perform the request. PyObjC returns the NSError** out-parameter as an
    # extra return value, so pass None for it and unpack (success, error).
    success, error = vision_handler.performRequests_error_([text_request], None)

    if not success:
        print(f"Failed to perform Vision request: {error}")
    
    return results

# Example usage:
# The image path comes from VISION_TEST_IMAGE_PATH, with a placeholder file name
# as the fallback. Nothing is created here; supply a real image containing text
# (for example, one made with an image editor or Pillow).
dummy_image_path = os.environ.get('VISION_TEST_IMAGE_PATH', 'dummy_image_with_text.png')
if not os.path.exists(dummy_image_path):
    print(f"Please set VISION_TEST_IMAGE_PATH or create a file named '{dummy_image_path}' to run this example.")
    print("Example: create a simple PNG with some text using an image editor.")
else:
    print(f"Analyzing image: {dummy_image_path}")
    detected_text = recognize_text_in_image(dummy_image_path)
    if detected_text:
        print("Detected text:")
        for text in detected_text:
            print(f"- {text}")
    else:
        print("No text detected or an error occurred.")
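
The result-processing step inside the completion handler can be factored into a plain function, which makes it testable without macOS. This is a minimal sketch, assuming observations expose `topCandidates_()` and each candidate exposes `string()` and `confidence()`, as `VNRecognizedText` does. The names `collect_text`, `min_confidence`, `FakeCandidate`, and `FakeObservation` are hypothetical and used only for illustration.

```python
def collect_text(observations, min_confidence=0.0):
    """Return (text, confidence) pairs for the top candidate of each observation."""
    lines = []
    # request.results() can be None when nothing was recognized.
    for observation in observations or []:
        candidates = observation.topCandidates_(1)
        if not candidates:
            continue
        best = candidates[0]
        confidence = float(best.confidence())
        if confidence >= min_confidence:
            lines.append((str(best.string()), confidence))
    return lines


# Stand-ins that mimic the two Vision classes, for trying the helper off macOS.
class FakeCandidate:
    def __init__(self, text, confidence):
        self._text = text
        self._confidence = confidence

    def string(self):
        return self._text

    def confidence(self):
        return self._confidence


class FakeObservation:
    def __init__(self, *candidates):
        self._candidates = list(candidates)

    def topCandidates_(self, max_count):
        return self._candidates[:max_count]


sample = [
    FakeObservation(FakeCandidate("Hello", 0.9)),
    FakeObservation(FakeCandidate("w0rld", 0.5)),  # low-confidence line
    FakeObservation(),  # observation with no candidates
]
print(collect_text(sample, min_confidence=0.6))
```

In the quickstart, the handler would call `collect_text(request.results())` and extend `results` with the returned pairs.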
