USaddress

0.5.16 · active · verified Thu Apr 09

USaddress is a Python library that parses unstructured United States address strings into their individual components using natural language processing. It relies on a probabilistic model, Conditional Random Fields, to label each part of an address, which lets it make reasonable guesses even for complex or irregular input. The current version is 0.5.16, and the library is actively maintained.

Warnings

Install

Imports

Quickstart

This example demonstrates both the `parse()` method, which returns a list of (value, label) tuples, and the `tag()` method, which returns a more structured `OrderedDict` of components and an inferred address type.

import usaddress

address_string = "123 Main St. Suite 100 Chicago, IL 60601"

# The .parse() method returns a list of (value, label) tuples
parsed_address = usaddress.parse(address_string)
print("Parsed (tuples):")
print(parsed_address)

# The .tag() method returns an OrderedDict of components and an address type
tagged_address, address_type = usaddress.tag(address_string)
print("\nTagged (OrderedDict & Type):")
print(tagged_address)
print(f"Address Type: {address_type}")
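
To make the relationship between the two return shapes concrete, here is a minimal sketch of how a list of (value, label) tuples could be collapsed into a single mapping of components. It runs without usaddress installed. The `sample_pairs` list is hand-written for illustration and is not captured library output, and `merge_pairs` is a hypothetical helper, not part of the usaddress API. The library's own `tag()` does more than this, including inferring the address type.

```python
from collections import OrderedDict

# Illustrative (value, label) pairs in the shape parse() returns.
# Hand-written for this sketch; not captured output from usaddress.
sample_pairs = [
    ("123", "AddressNumber"),
    ("Main", "StreetName"),
    ("St.", "StreetNamePostType"),
    ("Suite", "OccupancyType"),
    ("100", "OccupancyIdentifier"),
    ("Chicago,", "PlaceName"),
    ("IL", "StateName"),
    ("60601", "ZipCode"),
]


def merge_pairs(pairs):
    """Hypothetical helper: join consecutive tokens that share a label.

    Raises ValueError if a label shows up in two non-adjacent places,
    since one key cannot hold two separate components.
    """
    merged = OrderedDict()
    previous_label = None
    for token, label in pairs:
        if label == previous_label:
            # Same component continues, e.g. a two-word city name.
            merged[label] += " " + token
        elif label in merged:
            raise ValueError(f"label {label!r} appears in two separate places")
        else:
            merged[label] = token
        previous_label = label
    # Trim stray separators such as the comma after the city.
    for label in merged:
        merged[label] = merged[label].strip(" ,;")
    return merged


components = merge_pairs(sample_pairs)
for label, value in components.items():
    print(f"{label}: {value}")
```

A mapping is convenient when each label occurs once. When an input repeats a label in separate places, the list of tuples from `parse()` preserves every token, whereas a single-key-per-label structure cannot.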
