img2pdf: Lossless Image to PDF Conversion
img2pdf is a Python library for lossless conversion of raster images to PDF. It embeds JPEG and JPEG2000 files directly, without re-encoding, which preserves the original quality and keeps the file size small; in effect, the PDF serves as a container format. Other image formats are stored with lossless zip compression. The current version is 0.6.3, and the library is actively maintained with regular releases.
Warnings
- gotcha img2pdf uses Pillow to read metadata and to convert images other than JPEG/JPEG2000, so Pillow's decompression bomb limit can prevent processing of very large images.
- gotcha JPEG images with an invalid EXIF Orientation value of zero (often from Android phones or Canon DSLR cameras) can cause errors. Passing `rotation=img2pdf.Rotation.ifvalid` to `img2pdf.convert()` tells img2pdf to ignore invalid rotation values.
- gotcha When converting multiple images from a directory, `os.listdir()` and `os.scandir()` return files in arbitrary order, which can put pages in the wrong sequence in the final PDF. Sort the image paths explicitly before passing them to `img2pdf.convert()`.
- gotcha Certain TIFF compression methods, especially 'Old-Style JPEG (in tiff)', might not be supported, leading to conversion failures.
- gotcha In the command-line usage, the output file argument (`-o`) must generally precede the input image files. Incorrect order can lead to unexpected behavior or errors.
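The page-ordering warning above can be sidestepped with an explicit sort. The following is a minimal sketch, not part of img2pdf: `natural_key` and `sorted_image_paths` are hypothetical helper names, and the extension set is an assumption to adjust for your own files. It uses natural ordering so that `page2.jpg` sorts before `page10.jpg`.

```python
import os
import re

# Assumed set of extensions to pick up; adjust to your inputs.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".jp2", ".png", ".tif", ".tiff"}


def natural_key(name):
    """Split a name into text and integer chunks so that 2 sorts before 10."""
    # re.split with a capturing group puts the digit runs at odd indexes.
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(re.split(r"(\d+)", name))]


def sorted_image_paths(directory):
    """Return image paths from `directory` in natural filename order."""
    names = [
        entry.name
        for entry in os.scandir(directory)
        if entry.is_file()
        and os.path.splitext(entry.name)[1].lower() in IMAGE_EXTENSIONS
    ]
    return [os.path.join(directory, n) for n in sorted(names, key=natural_key)]
```

The resulting list can be passed straight to `img2pdf.convert()`, for example `img2pdf.convert(sorted_image_paths("scans"))`, where `scans` is a placeholder directory name.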
Install
pip install img2pdf
Imports
- img2pdf
import img2pdf
Quickstart
import img2pdf
import os
# Create a dummy image file for demonstration
dummy_image_data = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10, 0x4A, 0x46, 0x49, 0x46, 0x00, 0x01, 0x01, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, 0xFF, 0xDB, 0x00, 0x43, 0x00, 0x03, 0x02, 0x02, 0x02, 0x02, 0x02, 0x03, 0x02, 0x02, 0x02, 0x03, 0x03, 0x03, 0x03, 0x04, 0x06, 0x04, 0x04, 0x04, 0x04, 0x04, 0x08, 0x06, 0x06, 0x05, 0x06, 0x09, 0x08, 0x0A, 0x0A, 0x09, 0x08, 0x09, 0x09, 0x0A, 0x0C, 0x0F, 0x0C, 0x0A, 0x0B, 0x0E, 0x0B, 0x09, 0x09, 0x0D, 0x11, 0x0D, 0x0E, 0x0F, 0x10, 0x10, 0x11, 0x10, 0x0A, 0x0C, 0x12, 0x13, 0x12, 0x10, 0x13, 0x11, 0x11, 0x10, 0xFF, 0xC0, 0x00, 0x11, 0x08, 0x00, 0x01, 0x00, 0x01, 0x03, 0x01, 0x22, 0x00, 0x02, 0x11, 0x01, 0x03, 0x11, 0x01, 0xFF, 0xC4, 0x00, 0x1F, 0x00, 0x00, 0x01, 0x05, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0xFF, 0xC4, 0x00, 0x1F, 0x01, 0x00, 0x03, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0xFF, 0xC4, 0x00, 0x1F, 0x10, 0x00, 0x01, 0x03, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0xFF, 0xC4, 0x00, 0x1F, 0x11, 0x00, 0x03, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0xFF, 0xDA, 0x00, 0x0C, 0x03, 0x01, 0x00, 0x02, 0x11, 0x03, 0x11, 0x00, 0x3F, 0x00, 0xF2, 0xCA, 0x88, 0x64, 0x02, 0xFF, 0xD9])
with open('temp_image.jpg', 'wb') as f:
    f.write(dummy_image_data)
# --- Quickstart ---
# Convert a single image file to PDF
with open('output_single.pdf', 'wb') as f_pdf:
    f_pdf.write(img2pdf.convert('temp_image.jpg'))
print('Successfully converted single image to output_single.pdf')
# Convert multiple images to a multi-page PDF
# (for demonstration, we'll just use the same dummy image twice)
image_files = ['temp_image.jpg', 'temp_image.jpg']
with open('output_multiple.pdf', 'wb') as f_pdf:
    f_pdf.write(img2pdf.convert(image_files))
print('Successfully converted multiple images to output_multiple.pdf')
# Clean up dummy image
os.remove('temp_image.jpg')
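Building on the quickstart, the sketch below shows one way to guard against two of the warnings listed earlier: invalid EXIF Orientation values and Pillow's decompression bomb limit. `images_to_pdf` and its parameters are hypothetical names introduced for illustration. The `rotation=img2pdf.Rotation.ifvalid` option comes from img2pdf, and `Image.MAX_IMAGE_PIXELS` is a Pillow setting; lifting it is assumed to be acceptable only for trusted inputs.

```python
def images_to_pdf(image_paths, output_path, ignore_invalid_rotation=True,
                  lift_pillow_limit=False):
    """Write `image_paths` to a PDF at `output_path` (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require img2pdf.
    import img2pdf

    if lift_pillow_limit:
        # Disables Pillow's decompression bomb check for the whole process;
        # only do this for images from a trusted source.
        from PIL import Image
        Image.MAX_IMAGE_PIXELS = None

    options = {}
    if ignore_invalid_rotation:
        # Ignore EXIF Orientation values that are not valid (such as zero).
        options["rotation"] = img2pdf.Rotation.ifvalid

    # Convert first, so a failed conversion does not leave an empty PDF behind.
    pdf_bytes = img2pdf.convert(image_paths, **options)
    with open(output_path, "wb") as f_pdf:
        f_pdf.write(pdf_bytes)
    return output_path
```

A call such as `images_to_pdf(['temp_image.jpg'], 'output_safe.pdf')` would mirror the single-image quickstart; the output filename here is a placeholder.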