pypandoc-binary: Pandoc Python Wrapper with Bundled Binary

1.17 · active · verified Thu Apr 09

pypandoc-binary is a thin Python wrapper for Pandoc that ships the Pandoc executable inside its wheels, so users do not need to install Pandoc separately. It is actively maintained, with releases at irregular intervals; the current version is 1.17.

Warnings

breaking Python 2 support was officially dropped in `pypandoc` v1.8. Attempts to use it with Python 2.x will fail.
Fix: Upgrade to Python 3.x (3.7+ is currently required).
breaking Support for Python 3.6 was removed in `pypandoc` v1.13. Users on older Python 3.x versions may encounter installation or runtime errors.
Fix: Upgrade to Python 3.7 or newer. The current PyPI package requires >=3.7.
gotcha `pypandoc-binary` bundles the Pandoc executable. However, `pypandoc` (the base package without `-binary`) requires Pandoc to be pre-installed on the system. Users sometimes confuse these or switch between them, leading to 'Pandoc not found' errors.
Fix: Ensure you are using `pypandoc-binary` if you want Pandoc bundled, or verify Pandoc is in your PATH if using the `pypandoc` package.
gotcha Since `pypandoc` v1.7.0, the `sandbox` mode for Pandoc (versions >= 2.15) is enabled by default. This enhances security but can restrict access to local files or network resources during conversion, potentially breaking existing workflows that relied on such access.
Fix: If conversions fail due to sandboxing, control it through the `sandbox` keyword argument of `convert_text` or `convert_file` (for example `sandbox=False`) rather than a command-line flag; Pandoc's own option is `--sandbox`. Check the default in your installed version, and be aware of the security implications of disabling it for untrusted input.
gotcha Pandoc 2.11 and later, including the versions bundled with `pypandoc-binary`, no longer ship the external `pandoc-citeproc` filter. Citation processing is built in and is enabled with the `--citeproc` option, so workflows that pass `--filter pandoc-citeproc` fail unless that filter is installed separately.
Fix: If citation processing fails, verify your Pandoc version and pass `extra_args=['--citeproc']` (plus a bibliography option as needed) instead of relying on the `pandoc-citeproc` filter.
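
The citation gotcha can be sketched as follows. This is a minimal example under assumptions: the helper names `citeproc_args` and `convert_with_citations` are hypothetical, and the bundled Pandoc is assumed to be 2.11 or later so that `--citeproc`, `--bibliography`, and `--csl` are available.

```python
def citeproc_args(bibliography, csl=None):
    """Build Pandoc arguments that enable the built-in citation processor."""
    args = ["--citeproc", f"--bibliography={bibliography}"]
    if csl is not None:
        # Optional citation style file
        args.append(f"--csl={csl}")
    return args


def convert_with_citations(source_file, to, bibliography, csl=None):
    """Convert `source_file`, resolving citations against `bibliography`."""
    import pypandoc  # lazy import: citeproc_args() itself has no dependency

    return pypandoc.convert_file(
        source_file, to, extra_args=citeproc_args(bibliography, csl)
    )
```

Keeping the argument list in its own function makes it easy to inspect exactly what is handed to Pandoc when a conversion fails.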

Install

pip install pypandoc-binary
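
After installing, it can help to check which distribution is actually present, since `pypandoc` and `pypandoc-binary` are easy to confuse. The following is a small diagnostic sketch using only the standard library; the function names are illustrative and not part of pypandoc.

```python
import shutil
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def describe_pandoc_setup():
    """Collect the facts that usually explain a 'Pandoc not found' error."""
    return {
        "pypandoc": installed_version("pypandoc"),
        "pypandoc-binary": installed_version("pypandoc-binary"),
        # A system-wide pandoc, if any; the bundled one need not be on PATH
        "pandoc_on_path": shutil.which("pandoc"),
    }


if __name__ == "__main__":
    for key, value in describe_pandoc_setup().items():
        print(f"{key}: {value}")
```

If the module imports correctly, `pypandoc.get_pandoc_path()` and `pypandoc.get_pandoc_version()` report which executable the wrapper will actually run.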

Imports

convert_text

import pypandoc; pypandoc.convert_text(...)

convert_file

import pypandoc; pypandoc.convert_file(...)

download_pandoc
```
import pypandoc; pypandoc.download_pandoc()
```
All core functions are imported from `pypandoc`, even when installing `pypandoc-binary`.
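
Because both distributions expose the same `pypandoc` module, an import failure usually means that neither is installed. A guarded import might look like the sketch below; `load_pypandoc` is a hypothetical helper name, not a library function.

```python
def load_pypandoc():
    """Import pypandoc, or raise an error that names the package to install."""
    try:
        import pypandoc
    except ImportError as exc:
        raise RuntimeError(
            "pypandoc is not importable; try: pip install pypandoc-binary"
        ) from exc
    return pypandoc
```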

Quickstart

This quickstart converts Markdown text to reStructuredText and a Markdown file to an HTML file using `pypandoc-binary`. All calls go through the `pypandoc` module, which uses the bundled Pandoc executable automatically.

import pypandoc
import os

# pypandoc-binary includes the pandoc executable in its wheel.
# No manual download is typically needed unless you want a different version.

markdown_text = "# Hello, World!\n\nThis is some **Markdown** content."

# Convert Markdown text to reStructuredText
try:
    rst_output = pypandoc.convert_text(markdown_text, 'rst', format='md')
    print("Converted to reStructuredText:\n" + rst_output)

    # Convert Markdown file to HTML file
    with open('input.md', 'w') as f:
        f.write(markdown_text)

    pypandoc.convert_file('input.md', 'html', outputfile='output.html')
    print("\nConverted input.md to output.html")
    with open('output.html', 'r') as f:
        print("\nContent of output.html:\n" + f.read()[:100] + '...') # Print first 100 chars

finally:
    # Clean up created files
    for f in ['input.md', 'output.html']:
        if os.path.exists(f):
            os.remove(f)
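
The same `convert_file` call extends naturally to a folder of Markdown files. The sketch below is one possible approach, not a pypandoc feature: `EXTENSIONS`, `output_path_for`, and `convert_folder` are hypothetical names, and the extension mapping covers only a few formats for illustration.

```python
from pathlib import Path

# Hypothetical mapping from Pandoc output format to file extension.
EXTENSIONS = {"html": ".html", "rst": ".rst", "docx": ".docx"}


def output_path_for(source, to):
    """Return the path next to `source` with the extension for format `to`."""
    return Path(source).with_suffix(EXTENSIONS[to])


def convert_folder(folder, to="html"):
    """Convert every .md file in `folder`; return the paths that were written."""
    import pypandoc  # imported lazily so output_path_for() has no dependency

    written = []
    for source in sorted(Path(folder).glob("*.md")):
        target = output_path_for(source, to)
        # With outputfile set, Pandoc writes the result to disk
        pypandoc.convert_file(str(source), to, outputfile=str(target))
        written.append(target)
    return written
```

Writing outputs next to their sources keeps the example short; a real project might prefer a separate build directory.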
