hyperlink library
hyperlink is a pure-Python library that provides a featureful, immutable, and semantically correct URL object, adhering to RFC 3986 and RFC 3987 for URIs and IRIs. It emphasizes correctness and ease of use, providing a robust model for URL parsing, construction, and transformation. The current stable version is 21.0.0, with major releases occurring approximately annually.
Warnings
- gotcha hyperlink's URL objects are immutable. All 'modifying' methods (e.g., `replace()`, `add()`, `drop()`, `set()`) return a *new* URL object with the changes, rather than modifying the object in place.
- gotcha The `hyperlink.parse()` function, by default (since v18.0.0), returns a `DecodedURL` which automatically handles encoding/decoding of URL components. Directly instantiating `hyperlink.URL` (or `hyperlink.EncodedURL`) requires manual handling of URL-reserved characters, which can easily lead to incorrect encoding if not carefully managed.
- breaking In version 19.0.0, the library changed how 'equals sign' characters are handled in query parameter *values*. Previously, they were escaped; now, they are not, aligning with standard web form encoding. This could be a breaking change if your system relied on the old escaping behavior for parsing or comparison.
- breaking In version 20.0.0, bug fixes related to hidden state on `URL` objects mean that constructor arguments like `rooted` and `uses_netloc` may be ignored if applying them would result in an invalid or unparseable URL. The object prioritizes validity over literal argument adherence in such cases.
Install
-
pip install hyperlink
Imports
- URL
from hyperlink import URL
- parse
from hyperlink import parse
- DecodedURL
from hyperlink import DecodedURL
Quickstart
from hyperlink import parse
# Parse a URL from text
url = parse(u'http://github.com/python-hyper/hyperlink?utm_source=readthedocs')
print(f"Original URL: {url.to_text()}")
# Create a new URL by replacing components (URL objects are immutable)
secure_url = url.replace(scheme=u'https', port=443)
print(f"Secure URL: {secure_url.to_text()}")
# Navigate to a child path or resolve relative paths
org_url = secure_url.click(u'.') # Clicks to the root of the domain
print(f"Org URL: {org_url.to_text()}")
# Access query parameters
utm_source = secure_url.get(u'utm_source')[0]
print(f"UTM Source: {utm_source}")
# Build a URL from components
new_url_from_parts = URL(scheme=u'https', host=u'example.com', path=[u'api', u'v1'], query=((u'id', u'123'),))
print(f"Built from parts: {new_url_from_parts.to_text()}")