PyMuPDF-pro

1.27.2.2 · active · verified Thu Apr 16

PyMuPDF-pro is a commercial extension for the open-source PyMuPDF library. It enables robust handling of Office documents (e.g., .doc, .docx, .ppt, .pptx, .xls, .xlsx) and other formats not natively supported by PyMuPDF. It facilitates text and table extraction, document conversion to PDF, and more. The current version is 1.27.2.2, with releases typically aligning with PyMuPDF updates and feature enhancements.

Quickstart

This quickstart converts an Office document (such as a DOCX file) to PDF with PyMuPDF-pro. It first unlocks the extension with your commercial license key via `pymupdf.pro.unlock`, then opens the file with `pymupdf.open` and writes the PDF bytes returned by `Document.convert_to_pdf` to disk. Replace the placeholder path with the path to an actual Office document.

import os
import sys
from pathlib import Path

try:
    import pymupdf.pro  # Also imports the base 'pymupdf' package ('fitz' is its legacy name)
except ImportError:
    print("ERROR: 'pymupdf.pro' not found. Is PyMuPDF-pro installed?")
    sys.exit(1)

# --- IMPORTANT: License Key Setup ---
# PyMuPDF-pro requires a commercial license key.
# Obtain your key from Artifex Software and set it as the
# 'PYMUPDFPRO_LICENSE' environment variable (a name chosen for this example).
license_key = os.environ.get('PYMUPDFPRO_LICENSE')

try:
    if license_key:
        pymupdf.pro.unlock(license_key)
    else:
        print("WARNING: 'PYMUPDFPRO_LICENSE' is not set.")
        print("Without a valid license key, PyMuPDF-pro features are restricted.")
        pymupdf.pro.unlock()  # No key: restricted mode, per the PyMuPDF Pro documentation
except Exception as e:
    print(f"ERROR: Failed to unlock PyMuPDF-pro: {e}")
    sys.exit(1)

# --- Example: Convert a DOCX file to PDF ---
# Set 'PYMUPDFPRO_INPUT_FILE' or replace the placeholder with an actual Office file path.
# You can use any supported format like .doc, .docx, .ppt, .pptx, .xls, .xlsx.
input_file = Path(os.environ.get('PYMUPDFPRO_INPUT_FILE', 'path/to/your/document.docx'))
output_file = Path("output.pdf")

if not input_file.is_file():
    print(f"\nWARNING: Input file '{input_file}' not found or is a placeholder.")
    print("Please provide a valid path to an Office document (e.g., .docx) for conversion.")
    sys.exit(1)

try:
    # Once unlocked, pymupdf.open() accepts Office documents directly.
    with pymupdf.open(str(input_file)) as doc:
        pdf_bytes = doc.convert_to_pdf()  # Render the document to PDF in memory
    output_file.write_bytes(pdf_bytes)
    print(f"\nSuccessfully converted '{input_file.name}' to '{output_file.name}'.")
except pymupdf.EmptyFileError:
    print(f"ERROR: Input file is empty: '{input_file}'.")
    sys.exit(1)
except Exception as e:
    print(f"An error occurred during conversion: {e}")
    # An invalid or expired license key is a common cause of failures here.
    sys.exit(1)
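
When converting many files, it can help to filter inputs by extension so that only Office documents reach the converter. The sketch below uses only the standard library and the extensions listed in the overview above. The names `OFFICE_EXTENSIONS` and `is_office_document` are hypothetical helpers, not part of PyMuPDF-pro, and the set may not cover every format the extension accepts.

```python
from pathlib import Path

# Extensions named in this page's overview.
# Hypothetical helper for illustration; not a PyMuPDF-pro API.
OFFICE_EXTENSIONS = {".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx"}


def is_office_document(path):
    """Return True if the path's suffix (case-insensitive) is a listed Office extension."""
    return Path(path).suffix.lower() in OFFICE_EXTENSIONS


if __name__ == "__main__":
    for name in ["report.DOCX", "slides.pptx", "notes.txt"]:
        print(name, is_office_document(name))
```

A check like this only inspects the file name, so a renamed or corrupt file can still fail when it is opened; the error handling around the conversion remains necessary.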
