PyObjC NaturalLanguage Framework

12.1 · active · verified Tue Apr 14

PyObjC is a bridge between Python and Objective-C, enabling Python scripts to leverage Apple's Cocoa frameworks. This particular library, `pyobjc-framework-naturallanguage` (version 12.1), provides Python wrappers for the macOS NaturalLanguage framework, allowing applications to access native natural language processing capabilities on macOS. PyObjC releases generally align with macOS SDK updates.

Warnings

Install

Imports
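
The framework is exposed as a top-level `NaturalLanguage` module, as the quickstart below uses it. Because the bindings only exist on macOS, a script that may also run elsewhere can check for the module before importing it. The following is a minimal sketch; the helper name `naturallanguage_available` is hypothetical and not part of PyObjC.

```python
import importlib.util
import sys


def naturallanguage_available() -> bool:
    """Report whether the NaturalLanguage bindings can be imported here.

    Hypothetical helper: it only checks the platform and whether the
    module can be located; it does not load the framework.
    """
    if sys.platform != "darwin":
        return False
    return importlib.util.find_spec("NaturalLanguage") is not None


if naturallanguage_available():
    import NaturalLanguage  # provided by pyobjc-framework-naturallanguage
else:
    NaturalLanguage = None  # callers should fall back or skip NLP features
```

Guarding the import this way keeps the rest of a cross-platform script importable even when the package is not installed.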

Quickstart

This quickstart demonstrates basic text tokenization and language recognition using the NaturalLanguage framework. It first splits a short two-sentence text into individual words with `NLTokenizer`, then identifies the dominant language of the same text with `NLLanguageRecognizer`.

import NaturalLanguage
from Foundation import NSMakeRange

text_to_analyze = "The quick brown fox jumps over the lazy dog. This is a second sentence."

# --- Tokenization Example ---
# Create an NLTokenizer instance for word units
tokenizer = NaturalLanguage.NLTokenizer.alloc().initWithUnit_(NaturalLanguage.NLTokenUnitWord)

# Set the string to be tokenized
tokenizer.setString_(text_to_analyze)

tokens = []
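# Enumerate tokens using a Python callable as the Objective-C block.
# The block is declared as (NSRange tokenRange, NLTokenizerAttributes flags, BOOL *stop),
# so the callable must accept three arguments.
def token_block_handler(token_range, flags, stop):
    # NSRange offsets count UTF-16 code units; they match Python string
    # indices here only because the sample text is plain ASCII.
    start = token_range.location
    length = token_range.length
    tokens.append(text_to_analyze[start : start + length])
    # The block returns void; enumeration runs until the range is exhausted.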

tokenizer.enumerateTokensInRange_usingBlock_(
    NSMakeRange(0, len(text_to_analyze)),
    token_block_handler
)

print(f"Original text: '{text_to_analyze}'")
print(f"Tokens (words): {tokens}")

# --- Language Recognition Example ---
lang_recognizer = NaturalLanguage.NLLanguageRecognizer.alloc().init()
lang_recognizer.processString_(text_to_analyze)
dominant_language = lang_recognizer.dominantLanguage()
print(f"Dominant language: {dominant_language}")
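
The tokenization example slices the Python string directly with the `NSRange` it receives. `NSRange` values count UTF-16 code units, while Python `str` indices count code points, so the two diverge as soon as the text contains characters outside the Basic Multilingual Plane (most emoji, for instance). The sketch below shows one way to slice by UTF-16 offsets in pure Python; `slice_utf16` and `utf16_length` are hypothetical helper names, not part of PyObjC.

```python
def utf16_length(text: str) -> int:
    """Length of text in UTF-16 code units (what NSString's length reports)."""
    return len(text.encode("utf-16-le")) // 2


def slice_utf16(text: str, location: int, length: int) -> str:
    """Return the substring covered by an NSRange-style (location, length)
    pair measured in UTF-16 code units."""
    encoded = text.encode("utf-16-le")  # two bytes per UTF-16 code unit
    return encoded[2 * location : 2 * (location + length)].decode("utf-16-le")


sample = "I 😀 cats"

# The emoji occupies two UTF-16 code units, so "cats" starts at
# UTF-16 offset 5 even though its Python index is 4.
word = slice_utf16(sample, 5, 4)
naive = sample[5 : 5 + 4]

print(word)   # cats
print(naive)  # ats
```

In the quickstart, this would mean passing `utf16_length(text_to_analyze)` as the range length and calling `slice_utf16(text_to_analyze, token_range.location, token_range.length)` inside the handler. For ASCII-only input both approaches give the same result.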
