{"id":1707,"library":"sgmllib3k","title":"sgmllib3k","description":"sgmllib3k is a Python 3 port of the `sgmllib` module, which was deprecated in Python 2.6 and removed in Python 3.0. It provides a basic SGML/HTML parser for legacy applications and installs under the original module name, `sgmllib`. The current version is 1.0.0, released in 2011, and the project appears to be abandoned.","status":"abandoned","version":"1.0.0","language":"en","source_language":"en","source_url":"https://github.com/dound/sgmllib3k","tags":["html parsing","sgml","legacy","abandoned","python3"],"install":[{"cmd":"pip install sgmllib3k","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"note":"The PyPI package `sgmllib3k` installs a module named `sgmllib`; there is no importable `sgmllib3k` module. The standard-library `sgmllib` was removed in Python 3.0, so this import only works once the package is installed.","wrong":"from sgmllib3k import SGMLParser","symbol":"SGMLParser","correct":"from sgmllib import SGMLParser"}],"quickstart":{"code":"import sgmllib  # module name installed by the sgmllib3k package\n\nclass MyParser(sgmllib.SGMLParser):\n    def __init__(self, verbose=0):\n        super().__init__(verbose)\n        self.data = []\n\n    def handle_data(self, data):\n        # Called with each run of text found between tags\n        self.data.append(data)\n\n    def unknown_starttag(self, tag, attrs):\n        # Called for start tags that have no start_<tag> or do_<tag> handler\n        pass\n\n    def unknown_endtag(self, tag):\n        # Called for end tags that have no end_<tag> handler\n        pass\n\nhtml_content = \"<html><body><h1>Hello</h1><p>World</p></body></html>\"\nparser = MyParser()\nparser.feed(html_content)\nparser.close()\n\nprint(\"Extracted data:\", parser.data)\n# Expected output: Extracted data: ['Hello', 'World']","lang":"python","description":"This quickstart creates a basic parser by subclassing `SGMLParser` and overriding handler methods such as `handle_data` to collect the parsed content. The `feed()` method passes the HTML string to the parser, and `close()` forces processing of any buffered data."},"warnings":[{"fix":"For new development or modern web parsing, consider the standard library's `html.parser` or a third-party library such as `BeautifulSoup`.","message":"sgmllib3k is not actively maintained and was last updated in 2011. It was written for early Python 3 versions (e.g., 3.0-3.2) and may not be compatible or stable with modern Python 3.x releases (3.6+).","severity":"breaking","affected_versions":"<=1.0.0"},{"fix":"Migrate to `html.parser` for basic HTML parsing or `BeautifulSoup` for more comprehensive and fault-tolerant parsing of real-world HTML documents.","message":"This library is a port of a deprecated module (`sgmllib`). It lacks features found in contemporary parsing libraries, such as HTML5 support, robust error handling, and performance optimizations. Its use is strongly discouraged for new projects.","severity":"deprecated","affected_versions":"<=1.0.0"},{"fix":"Be prepared to write significant boilerplate code for custom parsing logic. Evaluate whether `html.parser`'s `HTMLParser` (also event-driven, but maintained) or `BeautifulSoup`'s tree-based interface would simplify your task.","message":"`sgmllib3k` (like the original `sgmllib`) provides a very low-level, event-driven (SAX-like) parser and builds no document tree. You must implement the handler methods yourself (e.g., `start_<tag>`, `end_<tag>`, `do_<tag>`, `unknown_starttag`, `unknown_endtag`, `handle_data`), which can be verbose and error-prone for complex parsing tasks.","severity":"gotcha","affected_versions":"<=1.0.0"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}