xhtml2pdf
xhtml2pdf is an actively maintained Python library that converts HTML and CSS into PDF documents, building on the ReportLab Toolkit, html5lib, and pypdf. It supports HTML5 and CSS 2.1 (with some CSS 3), offering a platform-independent way to generate PDFs from web content. Releases are regular; the current version is 0.2.17.
Warnings
- breaking Python 2 support was officially dropped in version 0.2.6. As of version 0.2.12, Python 3.7 is no longer supported either. The library now requires Python 3.8 or newer.
- deprecated The direct imports `XML2PDF` and `XHTML2PDF` (e.g., `from xhtml2pdf import XML2PDF`) were deprecated in v0.2.7 and are scheduled for removal in a future version. Although `HTML2PDF` has been mentioned as an alternative, the primary and recommended entry point for conversion is `pisa.CreatePDF`.
- gotcha Compatibility with `ReportLab` versions is a recurring issue. `xhtml2pdf 0.2.16` added compatibility with `reportlab >= 4.1`, while `0.2.12` required `reportlab >= 4.0.4` and `0.2.11` required `reportlab >=3.5.53,<4`. Mismatched versions can lead to unexpected errors.
- breaking The underlying PDF library dependency for merging and other operations has changed. `PyPDF2` was replaced with `PyPDF3` in some earlier 0.2.x releases, and then `PyPDF3` was changed to `pypdf` in version 0.2.9.
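Because several of the warnings above depend on which versions are installed, it can help to check the environment before debugging a conversion failure. The following is a minimal sketch using only the standard library's `importlib.metadata` (available on Python 3.8+); the helper names `installed_version` and `report_versions` are illustrative and not part of xhtml2pdf.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


def report_versions(packages=("xhtml2pdf", "reportlab", "pypdf")):
    """Map each distribution name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print one line per package so mismatches are easy to spot
    for name, found in report_versions().items():
        print(f"{name}: {found or 'not installed'}")
```

Comparing the reported `reportlab` version against the requirement of the installed xhtml2pdf release is left to the reader, since the exact constraints differ between releases.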
Install
- pip install xhtml2pdf
- pip install xhtml2pdf[pycairo]
Imports
- pisa
from xhtml2pdf import pisa
Quickstart
from xhtml2pdf import pisa
html_content = '''
<html>
<head>
<style>
@page { size: A4 portrait; margin: 1cm; }
h1 { color: #333; }
p { font-family: sans-serif; }
</style>
</head>
<body>
<h1>Hello from xhtml2pdf!</h1>
<p>This is a simple HTML to PDF conversion example.</p>
<p>Visit <a href="https://github.com/xhtml2pdf/xhtml2pdf">xhtml2pdf on GitHub</a>.</p>
</body>
</html>
'''
def convert_html_to_pdf(source_html, output_filename):
    # Open in binary write mode; the with block closes the file even on error
    with open(output_filename, 'w+b') as result_file:
        pisa_status = pisa.CreatePDF(
            source_html,       # the HTML to convert
            dest=result_file)  # file handle to receive result
    # pisa_status.err is non-zero when the conversion reported errors
    if pisa_status.err:
        print(f"PDF creation failed with errors: {pisa_status.err}")
        return False
    print(f"PDF created successfully: {output_filename}")
    return True
# Example usage:
convert_html_to_pdf(html_content, "example.pdf")
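When the PDF should stay in memory rather than be written to disk (for example, to return it from a web handler), the same `pisa.CreatePDF` call can target an `io.BytesIO` buffer. This is a sketch under stated assumptions, not the library's prescribed pattern: the function name `html_to_pdf_bytes` and its optional `create_pdf` parameter are hypothetical, the latter added only so the function can be exercised with a stub when xhtml2pdf is not installed.

```python
from io import BytesIO


def html_to_pdf_bytes(source_html, create_pdf=None):
    """Convert an HTML string to PDF and return the result as bytes.

    create_pdf is a hypothetical hook for testing; by default the real
    pisa.CreatePDF is used.
    """
    if create_pdf is None:
        # Imported lazily so this module loads without xhtml2pdf installed
        from xhtml2pdf import pisa
        create_pdf = pisa.CreatePDF

    buffer = BytesIO()
    status = create_pdf(source_html, dest=buffer)

    # A non-zero err value means the conversion reported errors
    if status.err:
        raise RuntimeError(f"PDF creation failed with errors: {status.err}")
    return buffer.getvalue()


if __name__ == "__main__":
    pdf_bytes = html_to_pdf_bytes("<h1>Hello from xhtml2pdf!</h1>")
    print(f"Generated {len(pdf_bytes)} bytes")
```

Raising an exception instead of returning `False` is a design choice of this sketch; it keeps the return type a plain `bytes` object that callers can pass straight to a response or storage layer.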