TLD Extraction Library
The `tld` library extracts the top-level domain (TLD) from a URL and, when an object result is requested, also exposes the subdomain, the domain, and the first-level domain (`fld`). It relies on the Public Suffix List for accuracy. The current version is 0.13.2; releases typically arrive several times per year, though major versions may be spaced further apart.
Warnings
- gotcha The local copy of the TLD names list must be refreshed periodically to stay accurate. New TLDs are introduced regularly, so a stale list can produce incorrect results for newer domains.
- breaking The default value for the `search_private` parameter in `get_tld` changed from `True` to `False`.
- deprecated The function `tld.update_tld_list` was replaced by `tld.utils.update_tld_names`.
- breaking Python 3.6 support was dropped.
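One way to act on the update gotcha without hitting the network on every startup is to throttle the refresh with a timestamp file. The sketch below uses only the standard library; `refresh_if_stale`, the stamp file, and the one-week interval are hypothetical names and values chosen for illustration, not part of `tld`. The `updater` argument is any zero-argument callable, which in practice would be `tld.utils.update_tld_names`.

```python
import time
from pathlib import Path
from typing import Callable


def refresh_if_stale(
    stamp: Path,
    max_age_seconds: float,
    updater: Callable[[], object],
) -> bool:
    """Run `updater` only if `stamp` is missing or older than `max_age_seconds`.

    Returns True when the updater ran, False when the refresh was skipped.
    """
    if stamp.exists() and time.time() - stamp.stat().st_mtime < max_age_seconds:
        return False  # refreshed recently enough, nothing to do
    updater()  # assumption: in practice this is tld.utils.update_tld_names
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.touch()  # record the time of this refresh
    return True
```

A scheduled job could then call something like `refresh_if_stale(Path.home() / ".cache" / "tld.stamp", 7 * 24 * 3600, update_tld_names)`. If the updater raises, the stamp is not touched, so the next run tries again.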
Install
pip install tld
Imports
- get_tld
from tld import get_tld
- update_tld_names
from tld.utils import update_tld_names
Quickstart
from tld import get_tld
from tld.utils import update_tld_names
# It's good practice to update the TLD names regularly
# This fetches the latest Public Suffix List
# For production, consider running this in a scheduled job, not on every startup.
# update_tld_names()
url1 = "http://www.google.co.uk"
tld1 = get_tld(url1)
print(f"TLD for '{url1}': {tld1}")
url2 = "https://sub.domain.example.com/path?query=1"
obj2 = get_tld(url2, as_object=True)
print(f"TLD for '{url2}': {obj2.tld}")
print(f"Subdomain: {obj2.subdomain}")
print(f"Domain: {obj2.domain}")
print(f"First-level domain: {obj2.fld}")
invalid_url = "ftp://invalid-url"
try:
    # With fail_silently=False, a URL whose TLD cannot be resolved raises an exception
    get_tld(invalid_url, fail_silently=False)
except Exception as e:
    print(f"Error extracting TLD for '{invalid_url}': {e}")
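Because the Warnings section notes that the default of `search_private` has changed between versions, code that must behave the same across versions may prefer to pass it explicitly. The following is a minimal sketch under that assumption; `registered_domain` is a hypothetical helper name, and the choice of `search_private=False` is only an example, not a recommendation from the library.

```python
from typing import Optional


def registered_domain(url: str) -> Optional[str]:
    """Return the first-level domain of `url`, or None if it cannot be resolved."""
    # Imported here so this module still loads where `tld` is not installed.
    from tld import get_tld

    result = get_tld(
        url,
        as_object=True,        # return an object exposing .tld, .domain, .fld
        fail_silently=True,    # return None instead of raising on bad input
        search_private=False,  # pinned explicitly rather than relying on the default
    )
    return result.fld if result is not None else None
```

Called with `"https://sub.domain.example.com/path?query=1"`, this returns the same value that `obj2.fld` holds in the Quickstart. Returning `None` for unresolvable input keeps the error handling at the call site to a simple check instead of a `try`/`except`.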