tldextract

5.3.1 verified Tue May 12 auth: no python install: verified quickstart: verified

tldextract accurately separates a URL's subdomain, domain, and public suffix, using the Public Suffix List (PSL). It handles edge cases that naive parsing misses, such as multi-label suffixes like co.uk. By default, it supports public ICANN TLDs and their exceptions, with optional support for private domains. The current version is 5.3.1, and the library is actively maintained.
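To make the "naive parsing" problem concrete, here is a small standard-library sketch. The helper `naive_split` is hypothetical and exists only for illustration; it assumes the last label is always the suffix, which is exactly the assumption that fails for multi-label suffixes.

```python
def naive_split(host: str) -> tuple[str, str, str]:
    """Treat the last label as the suffix and the one before it as the domain."""
    labels = host.split(".")
    return ".".join(labels[:-2]), labels[-2], labels[-1]


# "co.uk" is a single public suffix, but the naive split cuts it in half.
naive = naive_split("forums.bbc.co.uk")
print(naive)  # ('forums.bbc', 'co', 'uk')
```

A PSL-based split of the same host gives subdomain `forums`, domain `bbc`, and suffix `co.uk`.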

pip install tldextract

Common errors

error ModuleNotFoundError: No module named 'tldextract'

cause The 'tldextract' package is not installed in the Python environment that is running your code.

fix

pip install tldextract

error ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed

cause Your system cannot verify the SSL certificate when tldextract attempts to download the Public Suffix List, often due to corporate proxies, firewall rules, or outdated/missing CA certificates.

fix

Ensure your system's CA certificates are up to date, or point the REQUESTS_CA_BUNDLE environment variable at a CA bundle that includes your proxy's certificate. If you do not need the live list, skip the download altogether by constructing the extractor with tldextract.TLDExtract(suffix_list_urls=()), which relies on the PSL snapshot bundled with the package.

error OSError: [Errno 13] Permission denied

cause The user running the Python process lacks write permissions to the default directory where tldextract attempts to cache the Public Suffix List data.

fix

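Pass a writable cache directory when constructing the extractor, e.g. tldextract.TLDExtract(cache_dir='/path/to/a/writable/directory'), or set the TLDEXTRACT_CACHE environment variable to such a path. Alternatively, adjust the permissions on the default cache location, or pass cache_dir=None to disable caching.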

Warnings

breaking The `ExtractResult` object changed from a `namedtuple` to a `dataclass` in v5.0.0. This means direct indexing, slicing, or unpacking the result object will raise a `TypeError`.

fix Access fields by attribute name (e.g., `result.subdomain`, `result.domain`, `result.suffix`) instead of indexing or unpacking.

breaking The `ExtractResult` object gained a fourth field, `is_private: bool`, in v4.0.0. Code that unpacks the result expecting only 3 fields will break.

fix If unpacking, adjust to expect four fields, or preferably, access fields by attribute name to avoid issues with future field additions.

deprecated The `registered_domain` property on `ExtractResult` was deprecated in v5.3.0. It will be removed in a future major version.

fix Use the `top_domain_under_public_suffix` property instead, which has the same behavior but a more accurate name.

breaking Support for Python 3.9 was dropped in v5.3.1, and Python 3.8 was dropped in v5.1.3. The library now requires Python 3.10 or newer.

fix Upgrade your Python environment to 3.10 or later to continue using the latest `tldextract` versions.

gotcha On its first run, `tldextract` fetches the latest Public Suffix List via an HTTP request and caches it indefinitely in `$HOME/.cache/python-tldextract`. This can cause an initial delay, and a network dependency, in environments that do not expect either.

fix To control caching, pass `cache_dir` when constructing `tldextract.TLDExtract()` or set the `TLDEXTRACT_CACHE` environment variable. To refresh the cached list explicitly, run the `tldextract --update` CLI command.

gotcha `tldextract` is lenient and performs minimal URL validation. It will attempt to extract components from any string, including partial or malformed URLs, prioritizing ease of use over strict validation.

fix If strict URL validation is required, validate the input before passing it to `tldextract`. Note that `urllib.parse.urlsplit` is itself a lenient parser, not a validator, so check the parsed scheme and hostname explicitly or use a dedicated validation library.
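A minimal pre-check using only the standard library might look like the sketch below. The helper name `looks_like_http_url` and the accepted schemes are assumptions chosen for illustration; real validation rules depend on your application.

```python
from urllib.parse import urlsplit


def looks_like_http_url(text: str) -> bool:
    """Accept only strings that parse to an http(s) URL with a hostname."""
    try:
        parts = urlsplit(text)
        return parts.scheme in {"http", "https"} and bool(parts.hostname)
    except ValueError:
        # urlsplit raises ValueError for some malformed inputs,
        # e.g. an unterminated IPv6 bracket.
        return False


candidates = ["http://forums.news.cnn.com/", "not a url", "ftp://example.com", "http://"]
accepted = [c for c in candidates if looks_like_http_url(c)]
print(accepted)
```

Only the strings in `accepted` would then be handed to `tldextract`.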

Install compatibility verified last tested: 2026-05-12

| python | os / libc | wheel | install | import | disk |
|---|---|---|---|---|---|
| 3.10 | alpine (musl) | wheel | - | 0.68s | 22.0M |
| 3.10 | alpine (musl) | - | - | 0.68s | 21.9M |
| 3.10 | slim (glibc) | wheel | 2.3s | 0.51s | 22M |
| 3.10 | slim (glibc) | - | - | 0.48s | 22M |
| 3.11 | alpine (musl) | wheel | - | 0.94s | 24.1M |
| 3.11 | alpine (musl) | - | - | 0.92s | 24.0M |
| 3.11 | slim (glibc) | wheel | 2.5s | 0.77s | 25M |
| 3.11 | slim (glibc) | - | - | 0.69s | 24M |
| 3.12 | alpine (musl) | wheel | - | 1.03s | 15.9M |
| 3.12 | alpine (musl) | - | - | 1.03s | 15.8M |
| 3.12 | slim (glibc) | wheel | 2.1s | 0.97s | 16M |
| 3.12 | slim (glibc) | - | - | 0.99s | 16M |
| 3.13 | alpine (musl) | wheel | - | 1.08s | 15.6M |
| 3.13 | alpine (musl) | - | - | 1.02s | 15.4M |
| 3.13 | slim (glibc) | wheel | 2.1s | 0.97s | 16M |
| 3.13 | slim (glibc) | - | - | 1.01s | 16M |
| 3.9 | alpine (musl) | wheel | - | 0.65s | 21.0M |
| 3.9 | alpine (musl) | - | - | 1.04s | 21.1M |
| 3.9 | slim (glibc) | wheel | 2.6s | 0.53s | 22M |
| 3.9 | slim (glibc) | - | - | 0.51s | 22M |

Imports

tldextract
```
import tldextract
```
ExtractResult
wrong
```
from tldextract.tldextract import ExtractResult
```
correct
```
from tldextract import ExtractResult
```
As of v5.2.0, ExtractResult was explicitly added to the public interface for easier import.
update
```
from tldextract import update
```
As of v5.2.0, the `update` function was explicitly added to the public interface.

Quickstart verified last tested: 2026-04-24

Demonstrates basic URL component extraction using `tldextract.extract()` and accessing the results as attributes of the `ExtractResult` object.

import tldextract

# Basic extraction
extract_result = tldextract.extract('http://forums.news.cnn.com/')
print(f"Subdomain: {extract_result.subdomain}")
print(f"Domain: {extract_result.domain}")
print(f"Suffix: {extract_result.suffix}")
print(f"Full host: {extract_result.fqdn}")

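# Example with a host under a privately managed suffix.
# By default, private PSL entries are ignored, so this host is split on the ICANN suffix "com".
# Pass include_psl_private_domains=True to extract() to honor private suffixes instead.
blogspot_extract = tldextract.extract('waiterrant.blogspot.com')
print(f"\nDefault (ICANN-only) example: {blogspot_extract.subdomain}.{blogspot_extract.domain}.{blogspot_extract.suffix}")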