pypandoc-binary: Pandoc Python Wrapper with Bundled Binary
pypandoc-binary is a thin Python wrapper for Pandoc whose wheels include the Pandoc executable, so users do not need to install Pandoc separately. It is actively maintained, with releases at irregular intervals; the current version is 1.17.
Warnings
- breaking Python 2 support was officially dropped in `pypandoc` v1.8. Using v1.8 or later with Python 2.x will fail.
- breaking Support for Python 3.6 was removed in `pypandoc` v1.13. Users on Python 3.6 or earlier may encounter installation or runtime errors.
- gotcha `pypandoc-binary` bundles the Pandoc executable. However, `pypandoc` (the base package without `-binary`) requires Pandoc to be pre-installed on the system. Users sometimes confuse these or switch between them, leading to 'Pandoc not found' errors.
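- gotcha Since `pypandoc` v1.7.0, the `sandbox` mode for Pandoc (versions >= 2.15) is enabled by default. This improves security but can restrict access to local files or network resources during conversion, which may break existing workflows that rely on such access. The default may differ in later releases, so pass the `sandbox` keyword argument explicitly to `convert_text` or `convert_file` if your workflow depends on either behavior.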
- gotcha Pandoc 2.11 and later, including the versions bundled with `pypandoc-binary`, no longer use the separate `pandoc-citeproc` filter. Citation processing is built into Pandoc itself and is enabled with the `--citeproc` option, which can be passed through `extra_args`. Only older Pandoc versions need `pandoc-citeproc` installed separately.
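For conversions that involve citations, a minimal sketch might look like the following. It assumes Pandoc 2.11 or later, where `--citeproc`, `--bibliography`, and `--csl` are standard Pandoc options. The helper names `build_citation_args` and `convert_with_citations` are hypothetical, and the `sandbox` keyword is assumed to be available (pypandoc 1.7.0 or later).

```python
from typing import List, Optional


def build_citation_args(bibliography: str, csl: Optional[str] = None) -> List[str]:
    """Build Pandoc command-line options for built-in citation processing."""
    args = ["--citeproc", f"--bibliography={bibliography}"]
    if csl is not None:
        # Optional citation style file
        args.append(f"--csl={csl}")
    return args


def convert_with_citations(source: str, bibliography: str) -> str:
    """Convert a Markdown file to HTML, resolving citations (hypothetical helper)."""
    import pypandoc  # imported lazily so build_citation_args works without it

    return pypandoc.convert_file(
        source,
        "html",
        extra_args=build_citation_args(bibliography),
        # Passed explicitly so the behavior does not depend on the installed
        # version's default. Assumption: the bibliography must be readable.
        sandbox=False,
    )
```

Keeping the argument construction in its own function makes it easy to check the options that will be passed to Pandoc before running a conversion.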
Install
pip install pypandoc-binary
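After installing, it can help to confirm which Pandoc executable pypandoc will actually use, since confusing `pypandoc` with `pypandoc-binary` is a common source of "Pandoc not found" errors. The sketch below uses `pypandoc.get_pandoc_version()` and `pypandoc.get_pandoc_path()`; the function name `describe_pandoc` is hypothetical.

```python
def describe_pandoc() -> str:
    """Return a one-line summary of the Pandoc binary pypandoc will use."""
    try:
        import pypandoc
    except ImportError:
        return "pypandoc is not installed"
    try:
        version = pypandoc.get_pandoc_version()
        path = pypandoc.get_pandoc_path()
    except OSError as exc:
        # pypandoc raises OSError when it cannot locate a pandoc executable
        return f"pandoc not found: {exc}"
    return f"pandoc {version} at {path}"


print(describe_pandoc())
```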
Imports
- convert_text
import pypandoc; pypandoc.convert_text(...)
- convert_file
import pypandoc; pypandoc.convert_file(...)
- download_pandoc
import pypandoc; pypandoc.download_pandoc()
Quickstart
import pypandoc
import os

# pypandoc-binary includes the pandoc executable in its wheel.
# No manual download is typically needed unless you want a different version.
markdown_text = "# Hello, World!\n\nThis is some **Markdown** content."

try:
    # Convert Markdown text to reStructuredText
    rst_output = pypandoc.convert_text(markdown_text, 'rst', format='md')
    print("Converted to reStructuredText:\n" + rst_output)

    # Convert Markdown file to HTML file
    with open('input.md', 'w') as f:
        f.write(markdown_text)
    pypandoc.convert_file('input.md', 'html', outputfile='output.html')
    print("\nConverted input.md to output.html")
    with open('output.html', 'r') as f:
        print("\nContent of output.html:\n" + f.read()[:100] + '...')  # Print first 100 chars
finally:
    # Clean up created files
    for f in ['input.md', 'output.html']:
        if os.path.exists(f):
            os.remove(f)
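The same `convert_file` call extends naturally to a directory of Markdown files. The following is a minimal sketch, not part of the library: `html_target` and `convert_tree` are hypothetical helper names, and each output file is written next to its source with an `.html` suffix.

```python
from pathlib import Path
from typing import List


def html_target(source: Path) -> Path:
    """Return the output path for a source file: same location, .html suffix."""
    return source.with_suffix(".html")


def convert_tree(root: Path) -> List[Path]:
    """Convert every .md file under root to HTML and return the files written."""
    import pypandoc  # imported lazily so html_target works without it

    written = []
    for source in sorted(root.rglob("*.md")):
        target = html_target(source)
        pypandoc.convert_file(str(source), "html", outputfile=str(target))
        written.append(target)
    return written
```

Calling `convert_tree(Path("docs"))` would convert every Markdown file below a `docs` directory; the directory name here is only a placeholder.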