pikepdf
pikepdf is a Python library for reading, writing, repairing, and transforming PDF files, built on the `qpdf` C++ library. It supports merging, splitting, compressing, and extracting data from PDFs. The current version is 10.5.1, and several minor and patch releases are published each year.
Warnings
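- gotcha pikepdf is a Python wrapper around the `qpdf` C++ library. The binary wheels published on PyPI bundle `qpdf`, so `pip install pikepdf` is sufficient on platforms that have a wheel. A separate `qpdf` installation is only required when building pikepdf from source, and pip does not install it in that case.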
- breaking The `Pdf.read()` method was removed in favor of `Pdf.open()`. Use `Pdf.open()`, preferably as a context manager (`with Pdf.open(...) as pdf:`), so the file is closed when the block exits.
- breaking Attributes like `page.trimbox` and `page.cropbox` now return `PageBox` objects instead of simple tuples or lists of numbers. These `PageBox` objects have methods like `pagebox.as_list()` or `pagebox.as_tuple()` to get the raw coordinates.
- gotcha Changes made to a `Pdf` object are not saved until `pdf.save()` is explicitly called. If you modify a PDF and forget to call `save()`, your changes will be lost.
Install
-
pip install pikepdf
Imports
- Pdf
from pikepdf import Pdf
- Page
from pikepdf import Page
Quickstart
from pikepdf import Pdf, Page
import os

# Create a new PDF and add a blank page
output_filename = "my_first_pikepdf.pdf"
with Pdf.new() as pdf:
    pdf.add_blank_page()
    pdf.save(output_filename)
print(f"Created {output_filename} with one blank page.")

# Open an existing PDF, add another page, and save it to a new file
modified_filename = "my_modified_pikepdf.pdf"
with Pdf.open(output_filename) as pdf:
    pdf.add_blank_page()
    print(f"PDF now has {len(pdf.pages)} pages.")
    pdf.save(modified_filename)
print(f"Modified PDF saved to {modified_filename}.")

# Clean up generated files (optional)
# os.remove(output_filename)
# os.remove(modified_filename)