Markdownify
Markdownify is a Python library that converts HTML to Markdown. The current version is 1.2.2, and the project is actively developed, with frequent releases that add features and fix issues.
Warnings
- breaking The signature of custom tag conversion methods (`convert_*()`) changed significantly in version 1.0.0. Custom converters written for earlier versions must be updated.
- gotcha The `strip` and `convert` options for `markdownify` are mutually exclusive. You cannot use both simultaneously.
- gotcha When customizing BeautifulSoup parser options via the `bs4_options` argument (added in v1.2.0), string or list values are treated as `features` (e.g., 'lxml', 'html5lib'), while dictionary values are passed as full keyword arguments to the BeautifulSoup constructor.
- gotcha By default, `markdownify` escapes asterisks (`*`) and underscores (`_`) that might be interpreted as Markdown formatting. If you want to disable this behavior, you need to explicitly set `escape_asterisks=False` or `escape_underscores=False`.
Install
pip install markdownify
Imports
- markdownify
from markdownify import markdownify as md
Quickstart
from markdownify import markdownify as md

html_content = '<h1>Hello World</h1><p>This is <b>bold</b> and <em>italic</em> text with a <a href="http://example.com">link</a>.</p>'
markdown_output = md(html_content)
print(markdown_output)