ocrmac
ocrmac is a Python wrapper that extracts text from images on macOS. It uses Apple's Vision framework to provide fast and accurate Optical Character Recognition (OCR). The library is currently at version 1.0.1 and is actively developed, with releases that add features and fix bugs. It requires macOS 10.15 (Catalina) or newer.
Common errors
-
ValueError: Invalid image format. Image must be a path or a PIL image.
cause: Attempting to pass an image object that is neither a string path nor a `PIL.Image.Image` object to the OCR functions.
fix: Ensure your image input is either a string representing the file path or a valid Pillow (PIL) Image object.
-
ValueError: Invalid recognition level. Recognition level must be 'accurate' or 'fast'.
cause: An unrecognized string was passed to the `recognition_level` parameter.
fix: Set `recognition_level` to either `'accurate'` or `'fast'`.
-
If you set a wrong language you will see an error message showing the languages available.
cause: The `language_preference` list contains an unsupported or incorrectly formatted language code.
fix: Provide a list of valid BCP-47 language identifiers (e.g., `['en-US']`, `['zh-Hans', 'de-DE']`). The error message itself might guide you to the available options.
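The first two errors can be caught before any OCR call is made. The following is a minimal sketch, not part of ocrmac: `check_ocr_args` is a hypothetical helper, and it assumes (based on the error text above) that only plain string paths and PIL images are accepted, so path-like objects are converted to strings first.

```python
import os

VALID_LEVELS = ("accurate", "fast")


def check_ocr_args(image, recognition_level="accurate"):
    """Hypothetical pre-flight check mirroring the documented ValueErrors."""
    # Assumption: pathlib.Path and similar objects are not accepted directly,
    # so turn them into plain strings.
    if isinstance(image, os.PathLike):
        image = os.fspath(image)
    if not isinstance(image, str):
        try:
            from PIL import Image  # only needed when an image object is passed
        except ImportError:
            Image = None
        if Image is None or not isinstance(image, Image.Image):
            raise ValueError("Invalid image format. Image must be a path or a PIL image.")
    if recognition_level not in VALID_LEVELS:
        raise ValueError(
            "Invalid recognition level. Recognition level must be 'accurate' or 'fast'."
        )
    return image, recognition_level
```

The returned pair can then be passed on to the OCR call, for example `image, level = check_ocr_args(path, "fast")`.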
Warnings
- breaking ocrmac is a macOS-exclusive library, relying on Apple's Vision framework. It will not function on Windows, Linux, or older macOS versions (requires macOS 10.15+).
- gotcha Using an unsupported or misspelled language code in `language_preference` will result in an error, often indicating available languages.
- gotcha The `recognition_level` parameter for OCR can be set to 'fast' or 'accurate'. 'Fast' provides quicker results but may sacrifice precision, while 'accurate' offers higher precision at a slower speed.
- gotcha ocrmac version 1.0.0 introduced support for choosing either the 'vision' or the 'livetext' framework as the backend. To use the newer LiveText features, set the `framework` parameter explicitly; omitting or misspelling it may lead to unexpected behavior or errors.
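Because of the platform restriction in the first warning, cross-platform code may want to check the host before importing ocrmac. A minimal sketch using only the standard library follows; `vision_ocr_supported` is a hypothetical helper name, and its two parameters exist only so the check can be exercised with explicit values.

```python
import platform


def vision_ocr_supported(system=None, mac_version=None):
    """Return True when the host looks like macOS 10.15 or newer."""
    system = platform.system() if system is None else system
    if system != "Darwin":  # platform.system() reports "Darwin" on macOS
        return False
    version = platform.mac_ver()[0] if mac_version is None else mac_version
    # Keep only the numeric major/minor components, e.g. "10.15.7" -> [10, 15]
    parts = [int(p) for p in version.split(".")[:2] if p.isdigit()]
    if not parts:
        return False
    major = parts[0]
    minor = parts[1] if len(parts) > 1 else 0
    return (major, minor) >= (10, 15)


print("Vision OCR supported on this machine:", vision_ocr_supported())
```

When the function returns False, the import of ocrmac can be skipped and a fallback OCR path used instead.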
Install
-
pip install ocrmac
Imports
- ocrmac
from ocrmac import ocrmac
- OCR
from ocrmac.ocrmac import OCR
Quickstart
import os

from ocrmac import ocrmac
from PIL import Image, ImageDraw, ImageFont

# Create a dummy image for demonstration purposes:
# dark text on a light background, large enough to hold the text
dummy_image_path = 'test_image.png'
img = Image.new('RGB', (400, 100), color='white')
d = ImageDraw.Draw(img)
try:
    # Use a system font path for broader compatibility
    font = ImageFont.truetype("/System/Library/Fonts/Supplemental/Arial.ttf", 36)
except OSError:
    font = ImageFont.load_default()  # Fallback if the font file is missing
d.text((20, 30), "Hello OCR!", fill=(0, 0, 0), font=font)
img.save(dummy_image_path)

try:
    # Perform OCR using the OCR class
    annotations = ocrmac.OCR(dummy_image_path, language_preference=['en-US']).recognize()
    print("Detected annotations:")
    for text, confidence, bounding_box in annotations:
        print(f"  Text: '{text}', Confidence: {confidence:.2f}, Bounding Box: {bounding_box}")

    # Alternatively, use the functional interface
    text_output = ocrmac.text_from_image(dummy_image_path, language_preference=['en-US'])
    print("\nDetected text (functional interface):")
    print(text_output)
finally:
    # Clean up the dummy image
    if os.path.exists(dummy_image_path):
        os.remove(dummy_image_path)
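The bounding boxes printed by the Quickstart usually need converting before they can be drawn on the image. The sketch below assumes each box is a normalized `[x, y, width, height]` list with its origin at the bottom-left corner, which is how Apple's Vision framework reports rectangles; this layout is an assumption and should be verified against the actual output. `to_pixel_box` is a hypothetical helper, not an ocrmac function.

```python
def to_pixel_box(bbox, image_width, image_height):
    """Convert a normalized, bottom-left-origin box to (left, top, right, bottom) pixels."""
    x, y, w, h = bbox
    left = x * image_width
    right = (x + w) * image_width
    # Flip the vertical axis: y is assumed to be measured upward from the bottom edge,
    # while image libraries such as Pillow measure downward from the top edge.
    top = (1.0 - y - h) * image_height
    bottom = (1.0 - y) * image_height
    return left, top, right, bottom
```

The resulting tuple matches the `(left, top, right, bottom)` order that Pillow's `ImageDraw.rectangle` accepts, so a detected region could be outlined with `d.rectangle(to_pixel_box(bounding_box, *img.size), outline='blue')`.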