{"id":1889,"library":"usaddress","title":"USaddress","description":"USaddress is a Python library designed for parsing unstructured United States address strings into their individual components, employing advanced Natural Language Processing (NLP) methods. It utilizes a probabilistic model, specifically Conditional Random Fields, to make educated guesses in identifying address parts, even in complex cases. The library's current version is 0.5.16, and it is actively maintained.","status":"active","version":"0.5.16","language":"en","source_language":"en","source_url":"https://github.com/datamade/usaddress","tags":["address parsing","NLP","US addresses","geocoding utility"],"install":[{"cmd":"pip install usaddress","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Core dependency for the Conditional Random Fields model used in parsing.","package":"python-crfsuite","optional":false}],"imports":[{"note":"The library is imported directly as 'usaddress'.","symbol":"usaddress","correct":"import usaddress"}],"quickstart":{"code":"import usaddress\n\naddress_string = \"123 Main St. Suite 100 Chicago, IL 60601\"\n\n# The .parse() method returns a list of (value, label) tuples\nparsed_address = usaddress.parse(address_string)\nprint(\"Parsed (tuples):\")\nprint(parsed_address)\n\n# The .tag() method returns an OrderedDict of components and an address type\ntagged_address, address_type = usaddress.tag(address_string)\nprint(\"\\nTagged (OrderedDict & Type):\")\nprint(tagged_address)\nprint(f\"Address Type: {address_type}\")","lang":"python","description":"This example demonstrates both the `parse()` method, which returns a list of (value, label) tuples, and the `tag()` method, which returns a more structured `OrderedDict` of components and an inferred address type."},"warnings":[{"fix":"Catch `RepeatedLabelError` when using `tag()` and consider using `parse()` for more granular (though less aggregated) output, or preprocess ambiguous input strings.","message":"The `usaddress.tag()` method may raise a `RepeatedLabelError` if an address string contains multiple components that are assigned the same label and cannot be logically concatenated into a single value (e.g., two distinct 'StreetName' labels without an 'IntersectionSeparator'). This often indicates an ambiguous or improperly formatted input address.","severity":"gotcha","affected_versions":"All versions 0.5.x and likely earlier"},{"fix":"Ensure input addresses are strictly within the United States. For international address parsing, use a library specifically designed for global addresses (e.g., libpostal Python bindings).","message":"usaddress is designed specifically for 'United States address strings'. Attempting to parse international addresses will likely result in incorrect component labeling or parsing failures. The underlying model is trained on US address patterns.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Do not rely on usaddress for address validation. For validation, integrate with a dedicated address verification service (e.g., USPS API, SmartyStreets, etc.) after parsing with usaddress.","message":"The library uses a probabilistic model to identify address components, but it 'cannot identify address components with perfect accuracy, nor can it verify that a given address is correct/valid.' It provides structured components but does not perform address validation (e.g., checking if an address is mailable or exists).","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}