pyap: Python Address Parser
Pyap is an MIT-licensed text processing library, written in Python, for detecting and parsing addresses using regular expressions. It currently supports US, Canadian, and British address formats. The latest release of the original package is version 0.3.1 (September 2020), which suggests it is updated infrequently or no longer actively maintained, though forks exist.
Common errors
- pyap.exceptions.NoCountrySelected: No country specified during library initialization.
  cause The `pyap.parse()` function was called without the required `country` argument.
  fix Always specify the `country` parameter, e.g., `pyap.parse(text, country='US')`.
- pyap.exceptions.CountryDetectionMissing: Detection rules for country "XX" not found.
  cause An unsupported or incorrect country code was provided to `pyap.parse()`.
  fix Use one of the supported country codes: 'US' (United States), 'CA' (Canada), or 'GB' (Great Britain). For other countries, this library does not provide support.
- AttributeError: 'list' object has no attribute 'as_dict'
  cause The `pyap.parse()` function returns a list of `Address` objects, not a single `Address` object. The `as_dict()` method is available on individual `Address` objects, not on the list itself.
  fix Iterate through the list of addresses or access a specific element by index before calling `as_dict()`. For example: `addresses = pyap.parse(text, country='US')` followed by `for address in addresses: print(address.as_dict())` or `if addresses: print(addresses[0].as_dict())`.
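The three fixes above can be combined into one small wrapper. The following is a minimal sketch, not part of pyap: `parse_to_dicts`, `SUPPORTED_COUNTRIES`, and the `parse_fn` parameter are hypothetical names introduced here for illustration. It checks the country code against the codes this page lists as supported, then calls `as_dict()` on each element of the returned list rather than on the list itself.

```python
from typing import Any, Callable, Dict, List, Optional

# Country codes this page lists as supported (an assumption taken from the text above).
SUPPORTED_COUNTRIES = {"US", "CA", "GB"}


def parse_to_dicts(
    text: str,
    country: str,
    parse_fn: Optional[Callable[..., List[Any]]] = None,
) -> List[Dict[str, Any]]:
    """Return every detected address as a dict; an empty list means no match.

    parse_fn is a hypothetical hook so the helper can be exercised without
    pyap installed; when omitted, pyap.parse is used.
    """
    code = country.upper()
    if code not in SUPPORTED_COUNTRIES:
        # Fail early with a plain ValueError instead of reaching pyap.
        raise ValueError(f"Unsupported country code: {country!r}")
    if parse_fn is None:
        import pyap  # imported lazily, only when the real parser is needed

        parse_fn = pyap.parse
    addresses = parse_fn(text, country=code)
    # pyap.parse returns a list, so as_dict() is called per element.
    return [address.as_dict() for address in addresses]
```

With pyap installed, `parse_to_dicts(text, "US")` should behave like the loop shown in the fix above. An empty result simply means no address was detected.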
Warnings
- deprecated The original `pyap` library (v0.3.1) has not been updated since September 2020. For more active maintenance, typing support, and handling of additional address formats/edge cases, consider using community-maintained forks like `pyap2` (`pip install pyap2`).
- gotcha Pyap is solely based on regular expressions and does not validate addresses against external databases or lists of cities/street names. This can lead to false positives where strings that *look* like addresses are parsed, even if they are not real-world locations (e.g., '1 SPIRITUAL HEALER DR SHARIF NSAMBU SPECIALISING IN' might be detected).
- gotcha The `country` parameter in `pyap.parse()` is mandatory. Failing to provide a supported country code will result in an exception.
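Because pyap does not validate matches against real-world data, one way to reduce false positives is to post-filter the parsed results. The sketch below is a heuristic under stated assumptions, not a feature of pyap: `keep_plausible` is a hypothetical helper, and the field names `city` and `postal_code` are assumed to be keys in the dictionaries returned by `as_dict()`. It will not catch every false positive, and it may discard genuine addresses that lack those fields.

```python
from typing import Any, Dict, Iterable, List, Sequence


def keep_plausible(
    parsed: Iterable[Dict[str, Any]],
    required_fields: Sequence[str] = ("city", "postal_code"),
) -> List[Dict[str, Any]]:
    """Keep only results in which every required field is non-empty.

    The default field names are assumptions about the keys produced by
    Address.as_dict(); adjust them to match the output you actually see.
    """
    kept = []
    for item in parsed:
        # Treat missing keys, None, and whitespace-only strings as empty.
        if all(str(item.get(field) or "").strip() for field in required_fields):
            kept.append(item)
    return kept
```

The input is expected to be a list of dictionaries, for example `[a.as_dict() for a in pyap.parse(text, country='GB')]`. For stronger guarantees, the filtered results would still need checking against an external address database.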
Install
- pip install pyap
Imports
- pyap
import pyap
Quickstart
import pyap
text_with_address = """This is some sample text containing an address: 225 E. John Carpenter Freeway, Suite 1500 Irving, Texas 75062. And more text."""
# Parse addresses for the US
addresses = pyap.parse(text_with_address, country='US')
for address in addresses:
    print(f"Found Address: {address}")
    print(f"Parsed Components: {address.as_dict()}")
# Example with a different country (Canada)
canada_address_text = """Meeting at 4998 Stairstep Lane Toronto ON, tomorrow."""
canada_addresses = pyap.parse(canada_address_text, country='CA')
for address in canada_addresses:
    print(f"Found Canadian Address: {address}")