PyMuPDF-pro
version 1.27.2.2, verified Thu Apr 16, auth: no, python
PyMuPDF-pro is a commercial extension for the open-source PyMuPDF library. It adds support for Office documents (e.g., .doc, .docx, .ppt, .pptx, .xls, .xlsx) and other formats that PyMuPDF does not handle natively, including text and table extraction and conversion to PDF. The current version is 1.27.2.2; releases typically track PyMuPDF updates and feature enhancements.
pip install PyMuPDF-pro
Common errors
error AttributeError: module 'fitz.utils' has no attribute 'set_license', or AttributeError: module 'fitz' has no attribute 'open_office_document'
cause PyMuPDF-pro might not be correctly installed, or `PyMuPDF` (which provides the `fitz` module) is missing, preventing PyMuPDF-pro from patching the `fitz` module.
fix Ensure both PyMuPDF and PyMuPDF-pro are installed: `pip install PyMuPDF PyMuPDF-pro`. Verify that `import fitz` is executed before any PyMuPDF-pro specific function is called.
error RuntimeError: License key expired or not valid
cause The license key provided to `fitz.utils.set_license()` is either incorrect, expired, or has not been properly registered with Artifex Software.
fix Verify your PyMuPDF-pro license key and ensure it is passed correctly to `fitz.utils.set_license()`. Contact Artifex Software support if you suspect the key is invalid or expired.
error fitz.EmptyFilename: cannot open path/to/nonexistent.docx
cause The file path provided to `fitz.open_office_document()` does not point to an existing file, or the path is inaccessible/incorrect.
fix Double-check the file path. Ensure the file exists at the specified location and that your application has read permission for it. Use absolute paths to avoid ambiguity.
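The path checks described in the last fix can be run before the file is handed to the library. The following is a minimal sketch using only the standard library. `resolve_input_path` is a hypothetical helper name, not part of PyMuPDF or PyMuPDF-pro, and the extension set simply mirrors the Office formats named on this page.

```python
import os
from pathlib import Path

# Office extensions named on this page (assumption: adjust to your needs).
OFFICE_SUFFIXES = {".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx"}


def resolve_input_path(raw_path: str) -> Path:
    """Return an absolute path to a readable Office file, or raise."""
    if not raw_path or not raw_path.strip():
        raise ValueError("input path is empty")
    # Expand '~' and convert to an absolute path to avoid ambiguity.
    path = Path(raw_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    if not os.access(path, os.R_OK):
        raise PermissionError(f"file is not readable: {path}")
    if path.suffix.lower() not in OFFICE_SUFFIXES:
        raise ValueError(f"unexpected file type: {path.suffix!r}")
    return path
```

Calling this helper first turns a vague "cannot open" failure into a specific exception (empty path, missing file, unreadable file, or unexpected extension) before any conversion is attempted.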
Warnings
breaking PyMuPDF-pro requires a valid commercial license key to unlock its functionality. Without a key set via `fitz.utils.set_license()`, most operations on Office documents will fail with a license error.
fix Obtain a commercial license key from Artifex Software and ensure it is correctly passed to `fitz.utils.set_license()` at the start of your application.
gotcha PyMuPDF-pro extends the `fitz` module of PyMuPDF. PyMuPDF must therefore be installed alongside PyMuPDF-pro for the extensions to be active and for functions like `fitz.open_office_document` to be available.
fix Ensure both `PyMuPDF` and `PyMuPDF-pro` are installed: `pip install PyMuPDF PyMuPDF-pro`.
gotcha PyMuPDF-pro supports Python 3.9 and higher. Using it with older Python versions results in installation or runtime errors.
fix Upgrade your Python environment to version 3.9 or newer.
gotcha Processing large or complex Office documents (especially conversion or detailed extraction) can be resource-intensive, requiring significant CPU and memory. Performance varies with document complexity and system resources.
fix Monitor resource usage for large documents and consider processing documents in batches or on systems with ample resources. Optimize document structures if possible.
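One way to follow the batching advice is to split the list of input files into fixed-size groups and let memory be reclaimed between groups. The sketch below uses only the standard library. `batched` and `process_in_batches` are hypothetical helpers, and `convert` stands in for whatever per-file conversion call your application uses; none of these names come from PyMuPDF-pro.

```python
import gc
from pathlib import Path
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[Path], size: int) -> Iterator[List[Path]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Path] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


def process_in_batches(
    paths: Iterable[Path],
    convert: Callable[[Path], None],
    size: int = 10,
) -> List[Path]:
    """Run `convert` on each path, batch by batch; return the paths that failed."""
    failed: List[Path] = []
    for batch in batched(paths, size):
        for path in batch:
            try:
                convert(path)
            except Exception as exc:  # keep going so one bad file does not stop the run
                print(f"skipping {path}: {exc}")
                failed.append(path)
        gc.collect()  # encourage memory release before the next batch
    return failed
```

A smaller `size` lowers peak memory at the cost of more frequent garbage-collection passes; the right value depends on document size and available resources.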
Imports
- fitz
  import fitz
- fitz.utils.set_license
  import fitz
- fitz.open_office_document
  import fitz
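Because every symbol above is reached through `import fitz`, a small preflight check can report what is missing before any document is opened. This is a minimal sketch, not an official diagnostic: `preflight` and `has_dotted_attr` are hypothetical helpers, and the distribution and attribute names are taken from this page as written, so treat them as assumptions about your installation.

```python
import importlib
import importlib.metadata
import sys

MIN_PYTHON = (3, 9)  # minimum Python version stated on this page
REQUIRED_DISTRIBUTIONS = ("PyMuPDF", "PyMuPDF-pro")  # names as used on this page
EXPECTED_ATTRIBUTES = ("utils.set_license", "open_office_document")  # as listed above


def has_dotted_attr(obj, dotted_name: str) -> bool:
    """Return True if obj.a.b... exists for a dotted name like 'a.b'."""
    for part in dotted_name.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


def preflight() -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required")
    for dist in REQUIRED_DISTRIBUTIONS:
        try:
            importlib.metadata.version(dist)
        except importlib.metadata.PackageNotFoundError:
            problems.append(f"{dist} is not installed")
    try:
        fitz = importlib.import_module("fitz")
    except Exception as exc:  # a missing or broken 'fitz' module both count as problems
        problems.append(f"cannot import fitz: {exc}")
    else:
        for name in EXPECTED_ATTRIBUTES:
            if not has_dotted_attr(fitz, name):
                problems.append(f"fitz.{name} is not available")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("PROBLEM:", problem)
```

Running the check at application startup makes the `AttributeError` cases listed under Common errors show up as explicit messages rather than failures deep inside a conversion call.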
Quickstart
import os
import fitz  # PyMuPDF-pro extends PyMuPDF's 'fitz' module
from pathlib import Path

# --- IMPORTANT: License Key Setup ---
# PyMuPDF-pro requires a commercial license key.
# Obtain your key from Artifex Software and set it as an environment variable,
# or replace 'YOUR_ACTUAL_LICENSE_KEY_HERE'.
license_key = os.environ.get('PYMUPDFPRO_LICENSE', 'YOUR_ACTUAL_LICENSE_KEY_HERE')

if license_key == 'YOUR_ACTUAL_LICENSE_KEY_HERE':
    print("WARNING: Please set the 'PYMUPDFPRO_LICENSE' environment variable or replace the placeholder.")
    print("Without a valid license, PyMuPDF-pro features will not function correctly.")
else:
    try:
        fitz.utils.set_license(license_key)
        print("PyMuPDF-pro license key setup attempted.")
    except AttributeError:
        print("ERROR: 'fitz.utils.set_license' not found. Is PyMuPDF-pro installed?")
        exit(1)
    except Exception as e:
        print(f"ERROR: Failed to set PyMuPDF-pro license: {e}")
        exit(1)

# --- Example: Convert a DOCX file to PDF ---
# Replace 'path/to/your/document.docx' with an actual Office file path.
# You can use any supported format like .doc, .docx, .ppt, .pptx, .xls, .xlsx.
input_file = Path(os.environ.get('PYMUPDFPRO_INPUT_FILE', 'path/to/your/document.docx'))
output_file = Path("output.pdf")

if not input_file.exists() or 'path/to/your/document.docx' in str(input_file):
    print(f"\nWARNING: Input file '{input_file}' not found or is a placeholder.")
    print("Please provide a valid path to an Office document (e.g., .docx) for conversion.")
    # For a truly runnable example without manual setup, one might create a dummy.
    # For this quickstart, we'll indicate failure if the file isn't provided.
    exit(1)

try:
    # Use fitz.open_office_document (functionality added by PyMuPDF-pro)
    doc = fitz.open_office_document(str(input_file))
    doc.save(str(output_file))  # Save as PDF (default format)
    doc.close()
    print(f"\nSuccessfully converted '{input_file.name}' to '{output_file.name}'.")
except fitz.EmptyFilename:
    print(f"ERROR: Input file path is invalid or empty: '{input_file}'.")
    exit(1)
except Exception as e:
    print(f"An error occurred during conversion: {e}")
    # Common error here if license is invalid: 'RuntimeError: License key expired or not valid'
    exit(1)
finally:
    # Clean up the output file for a clean runnable example
    if output_file.exists():
        os.remove(output_file)
        print(f"Cleaned up output file: {output_file}")