docxcompose
docxcompose is a Python library for concatenating (appending) Microsoft Word (.docx) files. It builds on `python-docx` to merge documents while preserving complex formatting and styles; headers and footers are taken from the first (master) document only. The current version is 2.1.0, with a release cadence of several updates per year.
Warnings
- gotcha Headers and footers from appended documents are ignored. Only the headers and footers of the *first* (master) document are used in the merged file.
- gotcha Merging documents with complex elements like SmartArt or custom properties can result in corrupted output documents. This is an open issue.
- breaking Despite PyPI indicating compatibility with Python >=3.10, some users have reported that recent versions (possibly 2.x) unexpectedly require Python 3.12 or newer.
- gotcha For applications bundled with PyInstaller, you may need to explicitly collect `docxcompose` data to avoid runtime errors (e.g., `pyinstaller --collect-data "docxcompose" your_script.py`).
- gotcha Processing very large documents can be slow and use a noticeable amount of memory, because every document is loaded into memory in full and documents are appended one after another.
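Some of the warnings above can be checked before any merging starts. The sketch below is a hypothetical pre-flight helper and not part of docxcompose: the name `preflight`, the default 3.10 floor (taken from the PyPI metadata mentioned above), and the 50 MB size threshold are all assumptions chosen for illustration.

```python
import sys
from pathlib import Path


def preflight(paths, min_python=(3, 10), warn_total_mb=50.0):
    """Return a list of human-readable problems found before merging.

    Hypothetical helper: the thresholds are illustrative defaults,
    not limits documented by docxcompose.
    """
    problems = []

    # Check the running interpreter against the assumed minimum version.
    if sys.version_info[:2] < tuple(min_python):
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or newer is expected")

    # Check each input file and add up the sizes of the usable ones.
    total_bytes = 0
    for path in map(Path, paths):
        if path.suffix.lower() != ".docx":
            problems.append(f"{path.name}: not a .docx file")
        elif not path.is_file():
            problems.append(f"{path.name}: file not found")
        else:
            total_bytes += path.stat().st_size

    # Large inputs are all held in memory, so flag them early.
    total_mb = total_bytes / (1024 * 1024)
    if total_mb > warn_total_mb:
        problems.append(
            f"inputs total {total_mb:.1f} MB; merging may be slow"
        )

    return problems
```

An empty list means none of these checks found a problem. It does not guarantee that the merge will succeed, for example when a document contains SmartArt.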
Install
pip install docxcompose
Imports
- Composer
from docxcompose.composer import Composer
- Document
from docx import Document
Quickstart
import os
from docx import Document
from docxcompose.composer import Composer
# Create dummy master and sub documents for demonstration
# In a real scenario, these would be existing .docx files
# Master document
master = Document()
master.add_heading('Master Document Title', level=0)
master.add_paragraph('This is the content of the master document.')
master.add_paragraph('It may contain its own headers and footers.')
master.save('master.docx')
# Sub document 1
doc1 = Document()
doc1.add_heading('Section 1', level=1)
doc1.add_paragraph('Content from the first appended document.')
doc1.save('doc1.docx')
# Sub document 2
doc2 = Document()
doc2.add_heading('Section 2', level=1)
doc2.add_paragraph('More content from the second appended document.')
doc2.save('doc2.docx')
# Compose documents
master_doc = Document('master.docx')
composer = Composer(master_doc)
# Append documents
composer.append(Document('doc1.docx'))
composer.append(Document('doc2.docx'))
# Save the combined document
composer.save('combined.docx')
print("Documents composed and saved to 'combined.docx'")
# Clean up dummy files (optional)
os.remove('master.docx')
os.remove('doc1.docx')
os.remove('doc2.docx')