Typing Stubs for html5lib
This package provides static type-checking stubs for the `html5lib` library, so that type checkers such as mypy can validate code that uses `html5lib` APIs. It is part of the `typeshed` project, which maintains and publishes type stubs for many popular Python libraries. New versions are released frequently, typically when the stubs change to track an upstream library update or a change in supported Python versions.
Warnings
- gotcha Type stubs are for static analysis only. Installing `types-html5lib` does not install the `html5lib` library itself, so `import html5lib` still fails at runtime unless `html5lib` is installed separately. Both packages are typically needed: `html5lib` for runtime execution and `types-html5lib` for type checking.
- gotcha Version mismatches between the `types-html5lib` stubs and the actual `html5lib` library can lead to incorrect or missing type hints. Stubs are developed against specific versions of the underlying library, so check that the stub release you install targets the `html5lib` version you run.
- breaking Significant API changes in the `html5lib` library itself will be reflected in subsequent `types-html5lib` releases. This can cause type-checking errors (e.g., 'missing attribute', 'unexpected argument') in code that was previously valid against older stubs.
- deprecated Stubs for older Python versions or very old `html5lib` versions may eventually be dropped from `typeshed`. This means `types-html5lib` might cease to provide support for specific legacy environments.
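The first two warnings can be checked mechanically. The following is a minimal sketch using only the standard library's `importlib.metadata`; the helper names (`installed_version`, `same_major_minor`, `describe_environment`) are hypothetical, and comparing only the first two version components is an assumption about how stub versions relate to runtime versions, not a documented guarantee.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def same_major_minor(stub_version: str, runtime_version: str) -> bool:
    """Compare only the first two dotted components of each version string.

    Assumption: the leading components of the stub version name the runtime
    release the stubs target. Treat a mismatch as a hint, not as proof.
    """
    return stub_version.split(".")[:2] == runtime_version.split(".")[:2]


def describe_environment() -> str:
    """Summarize whether the runtime package and its stubs are both present."""
    runtime = installed_version("html5lib")
    stubs = installed_version("types-html5lib")
    if runtime is None:
        return "html5lib is not installed: imports will fail at runtime"
    if stubs is None:
        return "types-html5lib is not installed: type checking has no stubs to use"
    if not same_major_minor(stubs, runtime):
        return f"possible mismatch: html5lib {runtime} vs types-html5lib {stubs}"
    return f"html5lib {runtime} and types-html5lib {stubs} are both installed"


if __name__ == "__main__":
    print(describe_environment())
```

Running the script in the environment your type checker uses shows whether both packages are visible there, which is the usual source of "missing stubs" reports.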
Install
pip install types-html5lib
Imports
- html5lib
import html5lib
- html5lib.HTMLParser
from html5lib import HTMLParser
Quickstart
import html5lib
from html5lib.treebuilders import getTreeBuilder
import io
from typing import TextIO
# The types-html5lib package provides static type checking for html5lib.
# You use html5lib normally, and a type checker (like mypy) will use the stubs
# to validate your code.
# Example: Parsing HTML from a string
html_string: str = "<html><body><h1>Hello World</h1><p>This is a test.</p></body></html>"
# Initialize the parser with a tree builder (e.g., 'dom' for a DOM-like tree)
parser = html5lib.HTMLParser(tree=getTreeBuilder("dom"))
# Parse the HTML string
document = parser.parse(html_string)
# How precisely a type checker can describe 'document' depends on the stubs and
# on the tree builder in use. With the 'dom' tree builder the result is a
# DOM-style document, so the hasattr check guards the DOM-specific call:
if hasattr(document, 'getElementsByTagName'):
    h1_elements = document.getElementsByTagName('h1')
    if h1_elements:
        print(f"First H1 tag content: {h1_elements[0].firstChild.nodeValue}")
# Example: Parsing from a file-like object
html_file_like: TextIO = io.StringIO("<html><head><title>Registry Entry</title></head><body></body></html>")
document_from_file = parser.parse(html_file_like)
if hasattr(document_from_file, 'getElementsByTagName'):
    title_elements = document_from_file.getElementsByTagName('title')
    if title_elements:
        print(f"Document title from file: {title_elements[0].firstChild.nodeValue}")
print("html5lib parsing completed with type-checking support provided by types-html5lib.")
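The `hasattr` checks in the Quickstart work at runtime but tell a type checker little. One alternative, sketched below, is to state the expected tree type once and then write ordinary annotated helpers against it. The function names `parse_to_dom` and `first_text` are hypothetical. The `cast` rests on the assumption that the `"dom"` tree builder returns an `xml.dom.minidom.Document`; whether the stubs already express that is not something this sketch relies on.

```python
from typing import Optional, cast
from xml.dom import Node
from xml.dom.minidom import Document


def parse_to_dom(markup: str) -> Document:
    """Parse markup with html5lib's 'dom' tree builder (needs html5lib at runtime)."""
    # Imported lazily so that first_text() below can be used and tested
    # without html5lib installed; a top-level import is the usual style.
    import html5lib

    # cast() only informs the type checker; it performs no runtime check.
    # Assumption: treebuilder="dom" yields an xml.dom.minidom Document.
    return cast(Document, html5lib.parse(markup, treebuilder="dom"))


def first_text(document: Document, tag: str) -> Optional[str]:
    """Return the leading text of the first <tag> element, or None."""
    elements = document.getElementsByTagName(tag)
    if not elements:
        return None
    child = elements[0].firstChild
    # An empty element has no first child, and the first child may be another
    # element rather than text; both cases are reported as None.
    if child is None or child.nodeType != Node.TEXT_NODE:
        return None
    return child.nodeValue
```

With `html5lib` installed, `first_text(parse_to_dom(html_string), "h1")` covers the same ground as the first Quickstart example, and it also handles an empty element, where `firstChild` is `None` and the Quickstart's `.nodeValue` access would raise.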