tinyhtml5

2.1.0 · active · verified Mon Apr 06

tinyhtml5 is an HTML5 parser, currently at version 2.1.0, that transforms a possibly malformed HTML document into an ElementTree tree. It is a simplified and modernized fork of the unmaintained `html5lib` library and focuses solely on parsing HTML into `ElementTree` output. New releases typically add support for new Python versions and make minor feature enhancements.

Warnings

Install

pip install tinyhtml5

Imports

from tinyhtml5 import parse

Quickstart

Parses an HTML string into an ElementTree object.

from tinyhtml5 import parse

html_string = '<html><body><p>Hello, tinyhtml5!</p></body></html>'
parsed_tree = parse(html_string)

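# parse() is expected to return the root <html> element of the tree.
# With the default settings, tag names are namespace-qualified,
# e.g. {http://www.w3.org/1999/xhtml}html.
print(parsed_tree)
print(parsed_tree.tag)
# The HTML5 parsing algorithm always inserts a <head>, so the children of
# <html> are head (index 0) and body (index 1). Indexing [0][0] would look
# inside the empty <head> and raise IndexError.
print(parsed_tree[1].tag)
print(parsed_tree[1][0].text)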
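
Positional indexing is fragile because the parser may add elements that were not in the input, and tag names may carry the XHTML namespace. The sketch below shows one way to look elements up by local name with the standard library instead. It assumes the parsed result is an `xml.etree.ElementTree.Element` whose tags are namespace-qualified. The tree here is built by hand as a stand-in for parser output, and the helper names (`build_stand_in_tree`, `local_name`, `first_text`) are hypothetical, not part of tinyhtml5.

```python
import xml.etree.ElementTree as ET

# Namespace assumed to be applied to HTML elements in the parsed tree.
XHTML = "http://www.w3.org/1999/xhtml"


def build_stand_in_tree():
    """Build by hand the tree shape assumed for the quickstart document."""
    html = ET.Element(f"{{{XHTML}}}html")
    ET.SubElement(html, f"{{{XHTML}}}head")  # present even if absent in the input
    body = ET.SubElement(html, f"{{{XHTML}}}body")
    paragraph = ET.SubElement(body, f"{{{XHTML}}}p")
    paragraph.text = "Hello, tinyhtml5!"
    return html


def local_name(element):
    """Return the tag without its '{namespace}' prefix, if it has one."""
    return element.tag.rsplit("}", 1)[-1]


def first_text(root, name):
    """Return the text of the first descendant called `name` in any namespace."""
    # The '{*}' wildcard matches any namespace (Python 3.8 and later).
    element = root.find(f".//{{*}}{name}")
    return None if element is None else element.text


root = build_stand_in_tree()
print([local_name(child) for child in root])
print(first_text(root, "p"))
```

If the assumptions hold, the same helpers should work on the result of `parse(html_string)` in place of the hand-built tree, since they rely only on the standard `Element` interface.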
