{"id":7290,"library":"html-to-json","title":"HTML to JSON Converter","description":"The `html-to-json` Python library, currently at version 2.0.0, provides functionality to convert HTML strings into a JSON representation. It also includes intelligent conversion for HTML tables. The project is currently in maintenance mode, with the author seeking sponsorship for active development and ongoing upkeep.","status":"maintenance","version":"2.0.0","language":"en","source_language":"en","source_url":"https://github.com/fhightower/html-to-json","tags":["html","json","parser","conversion","html-parsing"],"install":[{"cmd":"pip install html-to-json","lang":"bash","label":"Install stable release"}],"dependencies":[],"imports":[{"symbol":"html_to_json","correct":"import html_to_json"}],"quickstart":{"code":"import html_to_json\n\nhtml_string = \"\"\"<head>\n    <title>Test site</title>\n    <meta charset=\"UTF-8\">\n    <p>This is some <b>bold</b> text.</p>\n    <table>\n        <thead>\n            <tr><th>Header 1</th><th>Header 2</th></tr>\n        </thead>\n        <tbody>\n            <tr><td>Data 1</td><td>Data 2</td></tr>\n        </tbody>\n    </table>\n</head>\"\"\"\n\n# Convert HTML to JSON\noutput_json = html_to_json.convert(html_string)\nprint(output_json)\n\n# Convert HTML to JSON without capturing element values\noutput_no_values = html_to_json.convert(html_string, capture_element_values=False)\nprint(output_no_values)\n\n# Convert HTML to JSON without capturing element attributes\noutput_no_attributes = html_to_json.convert(html_string, capture_element_attributes=False)\nprint(output_no_attributes)","lang":"python","description":"This quickstart demonstrates how to convert a basic HTML string into JSON using the `html_to_json.convert` function. 
It also shows how to use the `capture_element_values` and `capture_element_attributes` parameters to control the output format."},"warnings":[{"fix":"Consider contributing to the project or sponsoring the author for continued development.","message":"The library is in maintenance-only mode, and the author has indicated that active development requires sponsorship. New features and rapid bug fixes may not be prioritized without community support.","severity":"maintenance","affected_versions":"2.0.0+"},{"fix":"Review your `html_to_json.convert()` calls and explicitly set `capture_element_values` or `capture_element_attributes` if you need precise control over whether values or attributes appear in the output.","message":"Version 2.0.0 introduced the `capture_element_values` and `capture_element_attributes` parameters to the `convert` function. Both default to `True`, but setting them explicitly may be necessary for consistent output when upgrading from earlier versions if your downstream code relies on a specific JSON structure.","severity":"gotcha","affected_versions":"1.x.x to 2.0.0"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Refer to the official GitHub README for `fhightower/html-to-json` version 2.0.0 for the correct function signature and available keyword arguments (e.g., `capture_element_values`, `capture_element_attributes`).","cause":"Passing a keyword argument that `convert()` does not accept in version 2.0.0, such as a parameter from a different package with a similar name, or one that has since been removed or renamed.","error":"TypeError: convert() got an unexpected keyword argument 'some_old_param'"},{"fix":"The output nests each element as a list of dictionaries keyed by tag name, with `_value` holding the element's text and `_attributes` holding its HTML attributes. An element without text has no `_value` key, and an element without attributes has no `_attributes` key, so check that a key exists (or use `dict.get`) before indexing. Adjust your parsing logic accordingly.
For example, use `output_json['head'][0]['title'][0]['_value']` to read the title text.","cause":"Expecting a flattened structure, or indexing a key that the element does not have. The library stores text content under `_value` and element attributes under `_attributes`, and omits a key when the element has no text or no attributes.","error":"KeyError: '_value' or KeyError: '_attributes' in output JSON"},{"fix":"Use the dictionary returned by `html_to_json.convert()` directly. If you need a JSON string, serialize it with `json.dumps(output)`, and only call `json.loads()` on text produced that way. If the structure of the result is surprising, inspect the raw return value and check that the input HTML is well-formed, using an HTML validator when the source is external or untrusted.","cause":"`html_to_json.convert()` returns a Python dictionary, not a JSON string, so its result should not be passed to `json.loads()`. This error typically appears when the dictionary is first turned into text with `str()`, which produces single-quoted Python syntax rather than valid JSON, and that text is then parsed. Passing the dictionary itself to `json.loads()` raises a `TypeError` instead.","error":"json.decoder.JSONDecodeError: Expecting value: line X column Y (char Z)"}]}