Pyap2: Address Parser
Pyap2 is a maintained fork of pyap, a regex-based library for parsing US, CA, and UK addresses. It extracts structured address data from unstructured text. The fork adds typing support and handles more address formats and edge cases than the original. The current version is 0.2.12, and minor releases are frequent.
Common errors
- ModuleNotFoundError: No module named 'pyap'
  cause: You are importing from the original `pyap` package but have installed `pyap2` (the fork), or vice versa, or neither is installed.
  fix: To use `pyap2`, run `pip install pyap2` and change all imports to `from pyap2 import ...`. To use the original `pyap`, run `pip install pyap` and use `from pyap import ...`.
- IndexError: list index out of range (when trying to access addresses[0])
  cause: `pyap2.parse_address` returns an empty list when no addresses are found in the input string, and the code indexes into that empty list.
  fix: Check that the returned list is not empty before accessing elements, e.g., `addresses = pyap2.parse_address(...)` then `if addresses: first_address = addresses[0]`.
- No addresses found (empty list returned) even for a seemingly valid address.
  cause: Often an incorrect or missing `country` parameter, so the regex patterns do not match the address locale.
  fix: Provide the `country` parameter that corresponds to the address you are parsing (e.g., `country='US'`).
- Multiple addresses in a string are being parsed as a single, incorrect address.
  cause: The parser needs explicit newlines to distinguish between separate addresses within a single input string.
  fix: Insert a newline character (`\n`) between each distinct address in the input string. For example: `address_string = '123 Main St, Anytown, US\n456 Oak Ave, Otherville, US'`.
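The empty-list check from the IndexError entry can be wrapped in a small helper. This is a minimal sketch, not part of pyap2: `first_or_none` is a hypothetical name, and the parser is passed in as a callable so the helper makes no assumption about the import path or function name. `fake_parser` is a toy stand-in used only to make the example runnable without the library.

```python
from typing import Any, Callable, Optional, Sequence


def first_or_none(
    parser: Callable[..., Sequence[Any]], text: str, **kwargs: Any
) -> Optional[Any]:
    """Return the first parsed result, or None when the parser finds nothing."""
    results = parser(text, **kwargs)
    # An empty sequence is falsy, so this never indexes into an empty list.
    return results[0] if results else None


def fake_parser(text: str, country: str = "US") -> list:
    """Toy stand-in (assumption): treats each line containing a digit as one result."""
    return [
        line.strip()
        for line in text.splitlines()
        if any(ch.isdigit() for ch in line)
    ]


print(first_or_none(fake_parser, "no address in this text", country="US"))
print(first_or_none(fake_parser, "123 Main St\n456 Oak Ave", country="US"))
```

In real use, the library's parse function would be passed in place of `fake_parser`, with `country` forwarded as a keyword argument.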
Warnings
- breaking: Pyap2 is a maintained fork of the original `pyap` library. If migrating from `pyap`, you must change your package installation (`pip install pyap2`) and all import statements from `from pyap import ...` to `from pyap2 import ...`.
- gotcha: The `parse_address` function relies on regex and is highly sensitive to the input address format. For multiple addresses within a single string, each address must be separated by a newline to be parsed correctly. Otherwise, they might be treated as a single, malformed address.
- gotcha: The `country` parameter in `parse_address` is critical for accurate results. Failing to specify it, or providing an incorrect country, can lead to no addresses being found or incorrect parsing, as the regex patterns are country-specific.
- gotcha: Pyap2 currently only supports address parsing for the United States (US), Canada (CA), and the United Kingdom (UK). Addresses from other countries will not be parsed correctly, or at all.
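When addresses arrive as separate strings, the newline requirement described above can be met before parsing by joining them explicitly. The helper below is a minimal sketch using only the standard library; `join_addresses` is a hypothetical name and is not provided by pyap2.

```python
from typing import Iterable


def join_addresses(addresses: Iterable[str]) -> str:
    """Join address strings with newlines, trimming each and skipping blanks."""
    cleaned = (address.strip() for address in addresses)
    return "\n".join(address for address in cleaned if address)


address_string = join_addresses(
    [" 123 Main St, Anytown, US ", "", "456 Oak Ave, Otherville, US"]
)
print(repr(address_string))
```

The resulting string can then be passed to the parser in one call, with one address per line.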
Install
- pip install pyap2
Imports
- parse_address
  original pyap: from pyap import parse_address
  pyap2 (this fork): from pyap2 import parse_address
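Because the two packages are easy to confuse, it can help to check which module name is actually importable in the current environment before choosing an import line. The sketch below uses `importlib.util.find_spec` from the standard library; `first_importable` is a hypothetical helper name, and the candidate names are simply the two that appear in this document.

```python
from importlib.util import find_spec
from typing import Iterable, Optional


def first_importable(names: Iterable[str]) -> Optional[str]:
    """Return the first top-level module name that can be imported, else None."""
    for name in names:
        # find_spec returns None for a top-level module that is not installed.
        if find_spec(name) is not None:
            return name
    return None


# Prints whichever candidate is installed, or None if neither is.
print(first_importable(["pyap2", "pyap"]))
```

A result of `None` means neither package is installed in the active environment, which matches the ModuleNotFoundError case listed under Common errors.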
Quickstart
import pyap2

address_text = """
6162 E. Mockingbird Ln
Dallas, TX 75214
"""

# parse_address returns a list of Address objects
addresses = pyap2.parse_address(address_text, country='US')

if addresses:
    for addr in addresses:
        print(f"Parsed Address: {addr.full_address}")
        print(f" Street: {addr.street_number} {addr.street_name} {addr.street_type}")
        print(f" City: {addr.city}")
        print(f" Region: {addr.region1}")
        print(f" Postcode: {addr.postcode}")
        print(f" As Dictionary: {addr.as_dict()}")
else:
    print("No address found or parsed.")