rfc3986

2.0.0 verified Tue May 12 auth: no python install: verified quickstart: stale

rfc3986 is a Python implementation of RFC 3986, including validation and authority parsing. It also supports RFC 6874, which adds zone identifiers to IPv6 addresses. It provides APIs for parsing, validating, and building URIs, plus convenience methods for `urllib.parse` compatibility. The current version is 2.0.0, released in January 2022, and the project appears to be actively maintained.

pip install rfc3986
error rfc3986.exceptions.InvalidURIError: Invalid scheme component: 'http://'
cause The input string provided to `URIReference.from_string` does not conform to RFC 3986 standards for a valid URI, or a specific component is malformed.
fix
from rfc3986 import URIReference
from rfc3986.exceptions import RFC3986Exception

try:
    uri = URIReference.from_string('https://example.com/path')  # Well-formed URI
except RFC3986Exception as e:  # Base class for the exceptions this package raises
    print(f"Error: {e}")
error ImportError: cannot import name 'parse_uri' from 'rfc3986'
cause The function or class `parse_uri` does not exist at the top level of the `rfc3986` module; the primary method for parsing URIs is the `from_string` class method of `URIReference` (or `URI`).
fix
from rfc3986 import URIReference

uri = URIReference.from_string('https://example.com')
error AttributeError: 'NoneType' object has no attribute 'host'
cause This error occurs when an attribute such as `host` is looked up on a value that is `None`, for example on the `authority` of a relative URI reference, which has no authority component. In `rfc3986`, `authority` is a plain string (or `None`); `host`, `port`, and `userinfo` are read from the parsed reference itself.
fix
from rfc3986 import URIReference

uri = URIReference.from_string('/path/to/resource')  # A relative reference without authority

# Read host from the reference itself; it is None when there is no authority.
if uri.authority is not None:
    print(uri.host)
else:
    print("No authority component for this URI")
gotcha rfc3986 adheres strictly to RFC 3986, so malformed or non-standard URIs can parse differently than they do with `urllib.parse`. In particular, an authority component must be preceded by `//`; otherwise `rfc3986` may interpret it as part of the path.
fix Make sure URIs conform to RFC 3986, especially by including `//` before the authority. For an interface compatible with `urllib.parse`, use `rfc3986.urlparse()`, but be aware of its limitations and that it still applies the stricter RFC 3986 interpretation.
gotcha `URIReference` and `uri_reference()` implement RFC 3986 only; they do not handle Internationalized Resource Identifiers (IRIs) as defined in RFC 3987.
fix If IRI support is required, check whether the separate `IRIReference` class (also reachable through `rfc3986.iri_reference()`) in your installed version covers your case, or pre-process IRIs so they are RFC 3986 compatible before passing them to `rfc3986`. Alternatively, consider a different library (e.g., `rfc3987`, or `uritools` with caution).
gotcha The `is_valid()` method on parsed URI objects might, in some edge cases, accept invalid hostnames or out-of-range port numbers (according to open GitHub issues).
fix For security-sensitive URI validation, do not rely solely on `is_valid()`. Implement additional checks for hostname format and port range, or combine it with a custom `Validator` instance that explicitly defines the allowed components.
gotcha The `copy_with` method on parsed URI objects (from `uri_reference` or `urlparse`) replaces existing components rather than extending them. For example, adding a path segment replaces the entire path.
fix If you need to extend path components or append to query parameters, retrieve the existing components, manipulate them as strings or lists, and then pass the complete new value to `copy_with`. Alternatively, use `URIBuilder` and its methods for more controlled incremental building.
gotcha The `allow_schemes()` method of `rfc3986.validators.Validator` expects scheme names as individual string arguments (e.g., `allow_schemes('https', 'http')`), not as a single list. Passing a list object directly (e.g., `allow_schemes(['https'])`) raises `AttributeError: 'list' object has no attribute 'lower'` when the method tries to normalize the scheme.
fix Pass each scheme as a separate string argument (e.g., `validator.allow_schemes('https', 'http')`, or `validator.allow_schemes('https')` for a single scheme). If you have a list of schemes, unpack it with the `*` operator: `validator.allow_schemes(*my_schemes)`.
python  os / libc      wheel  install  import  disk
3.10    alpine (musl)  wheel  -        0.12s   18.0M
3.10    alpine (musl)  -      -        0.14s   18.0M
3.10    slim (glibc)   wheel  1.5s     0.07s   19M
3.10    slim (glibc)   -      -        0.09s   19M
3.11    alpine (musl)  wheel  -        0.31s   19.9M
3.11    alpine (musl)  -      -        0.38s   19.9M
3.11    slim (glibc)   wheel  1.6s     0.30s   20M
3.11    slim (glibc)   -      -        0.28s   20M
3.12    alpine (musl)  wheel  -        0.21s   11.7M
3.12    alpine (musl)  -      -        0.23s   11.7M
3.12    slim (glibc)   wheel  1.4s     0.24s   12M
3.12    slim (glibc)   -      -        0.26s   12M
3.13    alpine (musl)  wheel  -        0.19s   11.5M
3.13    alpine (musl)  -      -        0.29s   11.4M
3.13    slim (glibc)   wheel  1.6s     0.21s   12M
3.13    slim (glibc)   -      -        0.23s   12M
3.9     alpine (musl)  wheel  -        0.10s   17.5M
3.9     alpine (musl)  -      -        0.13s   17.5M
3.9     slim (glibc)   wheel  1.7s     0.09s   18M
3.9     slim (glibc)   -      -        0.11s   18M

This quickstart demonstrates how to parse a URI string into a `URIReference` object, access its components, validate it using a `Validator` instance with custom rules, and construct a new URI using `URIBuilder`.

from rfc3986 import uri_reference, validators

# Parsing a URI Reference
uri_str = 'https://user:pass@example.com:8080/path/to/resource?key=value#fragment'
uri = uri_reference(uri_str)

print(f"Scheme: {uri.scheme}") # Output: https
print(f"Host: {uri.host}")     # Output: example.com
print(f"Path: {uri.path}")     # Output: /path/to/resource
print(f"Query: {uri.query}")   # Output: key=value

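# Validating a URI
# allow_schemes() and allow_hosts() take values as separate arguments, not a list.
from rfc3986 import exceptions

validator = validators.Validator().allow_schemes('https').allow_hosts('example.com')

# validate() returns None on success and raises on failure, so use try/except.
try:
    validator.validate(uri)
    print("URI is valid according to custom rules.")
except exceptions.RFC3986Exception as e:
    print(f"URI is NOT valid according to custom rules: {e}")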

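# Building a URI
from rfc3986.builder import URIBuilder

builder = (URIBuilder()
           .add_scheme('https')
           .add_host('example.com')
           .add_path('/path/to/resource'))

# finalize() returns a URIReference; unsplit() renders it as a string.
built_uri = builder.finalize()
print(f"Built URI: {built_uri.unsplit()}") # Output: https://example.com/path/to/resource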