RFC 3987 Syntax Validation
rfc3987-syntax is a Python library (current version 1.1.0) that provides helper functions for validating strings against the ABNF grammar defined in RFC 3987, which specifies Internationalized Resource Identifiers (IRIs). It is lightweight, MIT-licensed, and built on the Lark parsing library. The project covers ABNF syntax validation only and states explicitly that additional semantic rules (such as Unicode normalization or BiDi constraints) must be handled separately. Releases are relatively frequent; the latest, v1.1.0 (a minor release), added support for IRI-reference and absolute-IRI.
Warnings
- gotcha This library *only* performs syntactic validation of RFC 3987 based on ABNF grammar. It does not validate semantic rules such as Unicode Normalization Form C (NFC), Bidirectional text (BiDi) constraints, port number ranges (0-65535), valid IPv6 compression, or context-aware percent-encoding requirements. These must be enforced separately for full compliance.
- gotcha This `rfc3987-syntax` package is a distinct project and is *not* affiliated with the older, GPL-licensed `rfc3987` Python package by Daniel Gerber, which has a different scope. Ensure you are installing `rfc3987-syntax` to avoid licensing conflicts or unexpected API differences.
- breaking Version 1.1.0 introduced explicit support for the `IRI-reference` and `absolute-IRI` terms and fixed a bug related to single-quote sub-delimiters. Although these changes mainly add functionality, they *could* subtly alter validation results: inputs that older versions rejected or parsed differently may now be handled differently.
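Two of the semantic rules listed above (NFC normalization and the port range) can be checked with the standard library alone. The sketch below is illustrative and not part of `rfc3987-syntax`: the function names `is_nfc`, `has_valid_port`, and `passes_extra_checks` are hypothetical, and it assumes that `urllib.parse.urlsplit` splits the input well enough to extract a port. Such checks would typically run only after the syntactic validation has passed.

```python
import unicodedata
from urllib.parse import urlsplit


def is_nfc(value: str) -> bool:
    """Return True if value is already in Unicode Normalization Form C."""
    return unicodedata.is_normalized("NFC", value)


def has_valid_port(value: str) -> bool:
    """Return True if value has no port or a port within 0-65535."""
    try:
        # SplitResult.port raises ValueError for a non-numeric or out-of-range port.
        urlsplit(value).port
    except ValueError:
        return False
    return True


def passes_extra_checks(value: str) -> bool:
    """Combine the two illustrative semantic checks (hypothetical helper)."""
    return is_nfc(value) and has_valid_port(value)
```

BiDi constraints, IPv6 compression, and context-aware percent-encoding are not covered by this sketch and would need their own checks.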
Install
pip install rfc3987-syntax
Imports
- is_valid_syntax
from rfc3987_syntax import is_valid_syntax
- RFC3987_SYNTAX_TERMS
from rfc3987_syntax import RFC3987_SYNTAX_TERMS
Quickstart
from rfc3987_syntax import is_valid_syntax, RFC3987_SYNTAX_TERMS
# List all supported validation terms
print(f"Supported terms: {RFC3987_SYNTAX_TERMS}\n")
# Validate a string against a specific term
if is_valid_syntax(term='iri', value='http://example.com/path/to/resource?query=param#fragment'):
    print("✓ 'http://example.com/path/to/resource?query=param#fragment' is a valid IRI syntax")
else:
    print("✗ Invalid IRI syntax")
# Example of invalid syntax
if not is_valid_syntax(term='iri', value='not-an-iri-with- space'):
    print("✗ 'not-an-iri-with- space' is invalid IRI syntax as expected")
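When many strings need checking, the single-value pattern above can be wrapped in a small helper. The following is a minimal sketch, not part of the library: `partition_by_validity` is a hypothetical name, and the validity predicate is passed in as an argument so the helper itself has no dependency on `rfc3987-syntax`.

```python
from typing import Callable, Iterable, List, Tuple


def partition_by_validity(
    values: Iterable[str], is_valid: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    """Split values into (valid, invalid) lists, preserving input order."""
    valid: List[str] = []
    invalid: List[str] = []
    for value in values:
        # Route each value to the list matching the predicate's verdict.
        (valid if is_valid(value) else invalid).append(value)
    return valid, invalid
```

With the library installed, the predicate could be `lambda v: is_valid_syntax(term='iri', value=v)`, reusing the call shown in the quickstart.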