purl
purl is an immutable URL class designed for easy URL-building and manipulation in Python. It provides a clean API for interrogation and transformation of URLs, aiming to offer a more intuitive experience than the standard library's urlparse and urllib modules. The current version is 1.6, released in May 2021, and the project is actively maintained.
Warnings
- gotcha URL objects are immutable. Any method that appears to 'modify' the URL (e.g., `path()`, `query_param()`, `add_query_param()`) will return a *new* `URL` instance with the changes, leaving the original object untouched. Users must assign the result of these operations to a new variable or reassign the original variable.
- gotcha Accessor methods (e.g., `u.path()`) are overloaded to also act as mutator methods (e.g., `u.path('new/path')`). When used as a mutator, they will return a *new* `URL` instance, consistent with the immutability principle. This can be confusing if not explicitly understood, as it does not modify the object in place.
- gotcha The `purl` library (for general URL manipulation) should not be confused with `packageurl-python` (for Package URLs, or PURLs, specification). They are distinct libraries with different purposes, despite the similar naming. Ensure you import `from purl import URL` for this library.
Install
-
pip install purl
Imports
- URL
from purl import URL
- Template
from purl import Template
Quickstart
from purl import URL
# Constructing a URL
u = URL('https://www.example.com/path/to/resource?id=123&lang=en#section-1')
print(f"Original URL: {u}")
# Accessing parts of the URL
print(f"Scheme: {u.scheme()}")
print(f"Host: {u.host()}")
print(f"Path: {u.path()}")
print(f"Query parameter 'id': {u.query_param('id')}")
# Modifying the URL (returns a *new* URL object)
new_u = u.query_param('lang', 'fr').remove_query_param('id').fragment('top')
print(f"Modified URL: {new_u}")
print(f"Original URL (unchanged): {u}")
# Chaining operations
chained_url = URL('http://api.example.com').path('v1').path_segment(1, 'users').query_param('sort', 'asc')
print(f"Chained URL: {chained_url}")