python-docx-ml6: Microsoft Word .docx File Manipulation
python-docx-ml6 is a Python library for creating, reading, and updating Microsoft Word 2007+ (.docx) files. It is a fork of the original `python-docx` library that includes community-requested features not yet merged into the upstream project. The current version is 1.0.2, released in November 2023.
Common errors
- ModuleNotFoundError: No module named 'docx'
  cause: The interpreter cannot find the `docx` module, which is the import name used with `python-docx-ml6`. This usually means the package was not installed in the active environment, or an older, unmaintained package named `docx` was installed instead of `python-docx-ml6`.
  fix: Install `python-docx-ml6` in the active Python environment (`pip install python-docx-ml6`), then verify that it is present (`pip show python-docx-ml6`). If you previously installed the deprecated `docx` package, uninstall it (`pip uninstall docx`) and reinstall `python-docx-ml6`.
- Deprecation Warning: This code will cease to work in future versions.
  cause: This warning typically originates from an underlying dependency such as `lxml`, or from an older version of `python-docx` (or its fork `python-docx-ml6`) with known deprecation issues. For instance, `python-docx` versions before 0.8.11 emitted a known deprecation warning.
  fix: Upgrade `python-docx-ml6` and its core dependencies to their latest stable versions: `pip install --upgrade python-docx-ml6 lxml`. Review the library's GitHub issues for deprecations known to affect the current version.
- AttributeError: 'Document' object has no attribute 'some_feature'
  cause: The code expects a feature that exists only in the original `python-docx` or in a different fork. Alternatively, the feature exists in `python-docx-ml6` but is missing from the installed version or has a different name.
  fix: Confirm the installed version of `python-docx-ml6` (`pip show python-docx-ml6`). Consult the `ml6team/python-docx` GitHub repository for the API and features this fork supports. If the feature comes from the original `python-docx`, check whether `python-docx-ml6` implements it differently or omits it.
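The install checks described above can be scripted. The following is a minimal sketch using only the standard library; the helper name `diagnose_docx_install` is hypothetical and not part of `python-docx-ml6`. It reports which of the relevant distributions are installed in the interpreter that runs it and whether a `docx` module can be located.

```python
from importlib import metadata, util


def diagnose_docx_install():
    """Return installed versions of related distributions and docx importability.

    Hypothetical helper for troubleshooting; not part of python-docx-ml6.
    """
    report = {}
    # Distribution names as they would appear in `pip list`.
    for dist in ("python-docx-ml6", "python-docx", "docx"):
        try:
            report[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            report[dist] = None  # not installed in this environment
    # find_spec locates the module without importing it.
    report["docx importable"] = util.find_spec("docx") is not None
    return report


for name, value in diagnose_docx_install().items():
    print(f"{name}: {value}")
```

Run it with the same interpreter that raises the error. If `python-docx-ml6` shows `None`, the package is missing from that environment; if the `docx` distribution shows a version, the deprecated package is installed and may be the one being imported.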
Warnings
- gotcha `python-docx-ml6` is a fork of the original `python-docx`. It adds community-requested features, so users who migrate from the original `python-docx` or expect identical behavior should verify the specific functionality they rely on. Behavior may differ subtly, and the fork follows its own maintenance path.
- gotcha Saving a document to an existing filename overwrites the original file without any warning or prompt.
- gotcha When replacing text or placeholders in a document, simple approaches might affect only the first occurrence within a paragraph or run. Replacing every occurrence usually requires iterating through all paragraphs and their runs.
- breaking If migrating from very old versions of `python-docx` (0.2.x or earlier), be aware that versions 0.3.0 and later introduced significant API incompatibilities. While `python-docx-ml6` is a fork of a modern `python-docx`, this historical context is relevant for deep migrations.
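The overwrite and replacement gotchas above can be handled with two small helpers. This is a sketch under stated assumptions, not library API: `unique_path` and `replace_in_runs` are hypothetical names, and `template.docx` in the usage comment is a placeholder file. The replacement helper relies only on the `runs` and `text` attributes of paragraphs and runs, and it does not find a placeholder that Word has split across several runs.

```python
from pathlib import Path


def unique_path(path):
    """Return `path` if it is unused, else the first free 'stem-N.ext' variant.

    Hypothetical helper: avoids overwriting an existing file on save.
    """
    path = Path(path)
    if not path.exists():
        return path
    counter = 1
    while True:
        candidate = path.with_name(f"{path.stem}-{counter}{path.suffix}")
        if not candidate.exists():
            return candidate
        counter += 1


def replace_in_runs(paragraphs, old, new):
    """Replace every occurrence of `old` in each run; return the number of runs changed.

    Editing run by run keeps each run's character formatting, but text that is
    split across two runs is not matched.
    """
    changed = 0
    for paragraph in paragraphs:
        for run in paragraph.runs:
            if old in run.text:
                run.text = run.text.replace(old, new)
                changed += 1
    return changed


# Usage with the library (assumes a placeholder file named template.docx):
#   from docx import Document
#   document = Document("template.docx")
#   replace_in_runs(document.paragraphs, "{{name}}", "Ada")
#   document.save(unique_path("output.docx"))
```

Note that `document.paragraphs` covers body paragraphs only. Text inside table cells has to be visited separately, for example by passing each cell's `paragraphs` to the same helper.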
Install
- pip install python-docx-ml6
Imports
- Document
import docx
from docx import Document
Quickstart
from docx import Document
from docx.shared import Inches
# Create a new document
document = Document()
# Add a heading
document.add_heading('Document Title', level=0)
# Add a paragraph
p = document.add_paragraph('A plain paragraph having some ')
p.add_run('bold').bold = True
p.add_run(' and some ')
p.add_run('italic.').italic = True
# Add a heading with level 1
document.add_heading('Heading, level 1', level=1)
# Add a picture (replace the placeholder path with a real image file;
# a missing file raises FileNotFoundError)
document.add_picture('path/to/image.png', width=Inches(1.25))
# Add a table
records = (
    (3, '101', 'Spam'),
    (7, '422', 'Eggs'),
    (4, '631', 'Spam, eggs, and bacon'),
)
table = document.add_table(rows=1, cols=3)
hdr_cells = table.rows[0].cells
hdr_cells[0].text = 'Qty'
hdr_cells[1].text = 'Id'
hdr_cells[2].text = 'Desc'
# Append one table row per record (item_id avoids shadowing the built-in id)
for qty, item_id, desc in records:
    row_cells = table.add_row().cells
    row_cells[0].text = str(qty)
    row_cells[1].text = item_id
    row_cells[2].text = desc
document.add_page_break()
# Save the document
document.save('demo.docx')
print("Document 'demo.docx' created successfully.")
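The quickstart covers creating a document; reading one back follows the same object model. The sketch below assumes the `paragraphs` and `tables` attributes of a `Document`, and the `rows`, `cells`, and `text` attributes of tables, rows, and cells. The function names `summarize` and `summarize_file` are hypothetical. `docx` is imported lazily so that the pure logic can be exercised without the library installed.

```python
def iter_table_rows(document):
    """Yield each table row in the document as a list of cell texts."""
    for table in document.tables:
        for row in table.rows:
            yield [cell.text for cell in row.cells]


def summarize(document):
    """Collect non-empty paragraph texts and table rows from a Document-like object.

    Headings are paragraphs too, so they appear in the paragraph list.
    """
    paragraphs = [p.text for p in document.paragraphs if p.text.strip()]
    rows = list(iter_table_rows(document))
    return {"paragraphs": paragraphs, "rows": rows}


def summarize_file(path):
    """Open a .docx file and summarize it (requires python-docx-ml6)."""
    from docx import Document  # lazy import: only needed when reading a real file

    return summarize(Document(path))


# Usage, after running the quickstart:
#   summary = summarize_file("demo.docx")
#   print(summary["paragraphs"])
#   print(summary["rows"])
```

Keeping `summarize` independent of the file-loading step makes it easy to check against simple stand-in objects before pointing it at real documents.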