HTML to JSON Converter

2.0.0 · maintenance · verified Thu Apr 16

The `html-to-json` Python library (version 2.0.0) converts HTML strings into a JSON representation and includes intelligent conversion for HTML tables. The project is in maintenance mode, and the author is seeking sponsorship to fund active development and ongoing upkeep.
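
To make the idea of table conversion concrete, here is a minimal sketch that uses only the Python standard library. It is not the library's implementation and does not use its API. The class `TableRows` and the function `table_to_records` are hypothetical names introduced for illustration, and the sketch assumes that the first row of a table holds the column headers.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only text inside an open cell is kept.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_records(html):
    """Return one dict per body row, keyed by the first row's cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


sample = """<table>
    <thead><tr><th>Header 1</th><th>Header 2</th></tr></thead>
    <tbody><tr><td>Data 1</td><td>Data 2</td></tr></tbody>
</table>"""

records = table_to_records(sample)
print(records)
```

Keying each row by its header text gives records that are easier to consume than nested `tr`/`td` lists, which is one plausible reading of "intelligent" table conversion. The structure the library actually returns may differ.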

Common errors

Warnings

Install

Imports

Quickstart

This quickstart converts a basic HTML string into JSON with the `html_to_json.convert` function. It also shows how the `capture_element_values` and `capture_element_attributes` parameters control whether element values and element attributes are included in the output.

import html_to_json

html_string = """<html>
<head>
    <title>Test site</title>
    <meta charset="UTF-8">
</head>
<body>
    <p>This is some <b>bold</b> text.</p>
    <table>
        <thead>
            <tr><th>Header 1</th><th>Header 2</th></tr>
        </thead>
        <tbody>
            <tr><td>Data 1</td><td>Data 2</td></tr>
        </tbody>
    </table>
</body>
</html>"""

# Convert HTML to JSON
output_json = html_to_json.convert(html_string)
print(output_json)

# Convert HTML to JSON without capturing element values
output_no_values = html_to_json.convert(html_string, capture_element_values=False)
print(output_no_values)

# Convert HTML to JSON without capturing element attributes
output_no_attributes = html_to_json.convert(html_string, capture_element_attributes=False)
print(output_no_attributes)
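
If `convert` returns a Python dict, as the printed output suggests, then `print` shows the dict's Python representation and not JSON text. A minimal sketch of serializing such a result with the standard `json` module follows. The `result` dict is a hypothetical stand-in, and the library's real key names and nesting may differ.

```python
import json

# Hypothetical stand-in for a converter result; the real structure and
# key names produced by html_to_json.convert may differ.
result = {"head": [{"title": [{"_value": "Test site"}]}]}

# Printing a dict shows its Python repr (single quotes), which is not JSON.
print(result)

# json.dumps produces JSON text; indent=2 makes it easier to read and
# ensure_ascii=False keeps non-ASCII characters as-is.
json_text = json.dumps(result, indent=2, ensure_ascii=False)
print(json_text)
```

The same call works on any nested structure of dicts, lists, strings, and numbers, so it should apply unchanged to whatever dict the converter returns.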
