HTML to Markdown Converter

3.1.0 · active · verified Tue Apr 14

html-to-markdown is a Python library for converting HTML to Markdown, built on a Rust core for performance. Version 3.1.0 exposes a small Python API and aims to produce consistent output across its language bindings. The library is actively maintained.

Warnings

Install

pip install html-to-markdown

Imports

from html_to_markdown import convert, ConversionOptions

Quickstart

This quickstart converts an HTML snippet to Markdown with the `convert` function. It then passes a `ConversionOptions` object to customize the output: first to set the heading style and list indent width, then to emit Djot instead of Markdown.

from html_to_markdown import convert, ConversionOptions

html_content = """
<h1>Welcome</h1>
<p>This is <strong>bold</strong> and <em>italic</em> text.</p>
<ul>
    <li>Item 1</li>
    <li>Item 2</li>
</ul>
"""

# Basic conversion
markdown_output = convert(html_content)
print(f"Default Markdown:\n{markdown_output}")

# Conversion with options
options = ConversionOptions(
    heading_style="atx",
    list_indent_width=2,
    output_format="commonmark"
)
formatted_markdown = convert(html_content, options)
print(f"\nFormatted Markdown (CommonMark):\n{formatted_markdown}")

# Example for Djot output (another lightweight markup language)
djot_options = ConversionOptions(output_format="djot")
djot_output = convert(html_content, djot_options)
print(f"\nDjot Output:\n{djot_output}")
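The `heading_style="atx"` option above refers to one of the two heading syntaxes defined by CommonMark: ATX headings use leading `#` characters, while Setext headings underline the text with `=` or `-`. The sketch below illustrates the difference in plain Python. It does not use html-to-markdown, and `render_heading` is a hypothetical helper rather than part of the library's API. Whether the library accepts a value such as "setext" is an assumption not confirmed by this page.

```python
def render_heading(text: str, level: int, style: str = "atx") -> str:
    """Render a heading in ATX or Setext syntax.

    Hypothetical illustration only; not part of html-to-markdown.
    """
    if not 1 <= level <= 6:
        raise ValueError("heading level must be between 1 and 6")
    # Setext syntax exists only for levels 1 and 2; deeper levels fall back to ATX.
    if style == "setext" and level <= 2:
        underline_char = "=" if level == 1 else "-"
        return f"{text}\n{underline_char * len(text)}"
    return f"{'#' * level} {text}"


# The same <h1>Welcome</h1> rendered in each style.
print(render_heading("Welcome", 1, "atx"))
print(render_heading("Welcome", 1, "setext"))
```

ATX is the more common choice for generated Markdown because it covers all six heading levels with a single syntax.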
