{"library":"tinyhtml5","title":"tinyhtml5","description":"tinyhtml5 is an HTML5 parser, currently at version 2.1.0, that transforms a possibly malformed HTML document into an ElementTree tree. It is a simplified and modernized fork of the unmaintained `html5lib` library, focused solely on parsing HTML and producing `ElementTree` output. Releases typically add support for new Python versions and minor feature enhancements.","status":"active","version":"2.1.0","language":"en","source_language":"en","source_url":"https://github.com/CourtBouillon/tinyhtml5","tags":["html","parser","html5","fork","html5lib","elementtree"],"install":[{"cmd":"pip install tinyhtml5","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Required for character encoding detection.","package":"webencodings","optional":false}],"imports":[{"note":"tinyhtml5 is a fork of html5lib with a simplified API. Importing 'parse' from html5lib uses the original library, not tinyhtml5.","wrong":"from html5lib import parse","symbol":"parse","correct":"from tinyhtml5 import parse"}],"quickstart":{"code":"from tinyhtml5 import parse\n\nhtml_string = '<html><body><p>Hello, tinyhtml5!</p></body></html>'\nparsed_tree = parse(html_string)\n\n# parse() returns the root <html> element of an ElementTree tree.\n# The HTML5 parsing algorithm always creates <head> and <body>,\n# so <head> is child 0 and <body> is child 1.\nprint(parsed_tree)\nprint(parsed_tree.tag)\nprint(parsed_tree[0].tag)  # <head>\nprint(parsed_tree[1].tag)  # <body>\nprint(parsed_tree[1][0].text)  # text of the <p> element","lang":"python","description":"Parses an HTML string and walks the resulting ElementTree tree."},"warnings":[{"fix":"Upgrade to Python 3.10+, or pin tinyhtml5 to '<2.1.0' if Python 3.9 is required.","message":"tinyhtml5 2.1.0 dropped support for Python 3.9 and now requires Python 3.10 or newer. Users on older Python versions must use tinyhtml5 2.0.0 or earlier.","severity":"breaking","affected_versions":">=2.1.0"},{"fix":"If migrating from `html5lib`, carefully review the 'Going Further' documentation, especially the 'What are the differences with html5lib?' section. Adapt code to work solely with `ElementTree` for tree manipulation.","message":"tinyhtml5 is a simplified fork of `html5lib`. It exposes only a single `tinyhtml5.parse()` function that returns an `ElementTree` object. Many `html5lib` features, such as tree walkers, adapters, filters, and alternative tree builders (e.g., DOM, lxml), are not supported in tinyhtml5.","severity":"breaking","affected_versions":">=2.0.0b1"},{"fix":"Familiarize yourself with the `xml.etree.ElementTree` API for navigating and manipulating the parsed HTML document, or explicitly convert the `ElementTree` to your preferred format or library.","message":"ElementTree is the only output format `tinyhtml5` supports. If you are accustomed to other HTML parsing libraries such as `BeautifulSoup` or `lxml`, note that `parse()` returns an `ElementTree` object, so you will need to use the `ElementTree` API or convert the result for further processing.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-06T00:00:00.000Z","next_check":"2026-07-05T00:00:00.000Z"}