{"id":5793,"library":"pandoc","title":"Pandoc Documents for Python","description":"Pandoc is a powerful, open-source command-line tool for converting documents between various formats (e.g., Markdown, HTML, LaTeX, PDF, Word). The `pandoc` Python library (version 2.4, released August 7, 2024) provides Python bindings to interact with Pandoc's document model, allowing for in-Python analysis, creation, and transformation of documents. It leverages the underlying Haskell-based Pandoc executable, which must be installed separately. The library generally follows an active release cadence, with updates to support recent Pandoc executable versions.","status":"active","version":"2.4","language":"en","source_language":"en","source_url":"https://github.com/boisgera/pandoc","tags":["document conversion","pandoc","markdown","html","latex","pdf","ast","text processing"],"install":[{"cmd":"pip install --upgrade pandoc","lang":"bash","label":"Install Python library"},{"cmd":"conda install -c conda-forge pandoc","lang":"bash","label":"Install Pandoc executable (Conda)"},{"cmd":"sudo apt install pandoc","lang":"bash","label":"Install Pandoc executable (Debian/Ubuntu)"},{"cmd":"brew install pandoc","lang":"bash","label":"Install Pandoc executable (macOS Homebrew)"}],"dependencies":[{"reason":"The Python 'pandoc' library is a wrapper around the external Pandoc command-line tool, which must be installed separately and accessible in the system's PATH.","package":"pandoc (executable)"},{"reason":"Required for process execution and shell interaction.","package":"plumbum"},{"reason":"Required for parsing.","package":"ply"}],"imports":[{"note":"Main module for reading, writing, and manipulating Pandoc documents.","symbol":"pandoc","correct":"import pandoc"},{"note":"Provides access to the Abstract Syntax Tree (AST) types for fine-grained document manipulation.","symbol":"types","correct":"from pandoc.types import Str, Space, Para, Meta"}],"quickstart":{"code":"import pandoc\nfrom 
pandoc.types import Str, Space, Para, Meta\n\n# Read a simple markdown string into a Pandoc document object\ntext = \"Hello world!\"\ndoc = pandoc.read(text)\nprint(f\"Initial document: {doc}\")\n\n# Access and modify an element in the document's Abstract Syntax Tree (AST)\n# For \"Hello world!\", doc is Pandoc(Meta({}), [Para([Str('Hello'), Space(), Str('world!')])])\n# doc[1] is the list of blocks, so the paragraph is at doc[1][0]\nparagraph = doc[1][0]\n\n# Para takes a single argument, the list of inlines, available as paragraph[0]:\n# [Str('Hello'), Space(), Str('world!')]\n# Str('world!') is therefore at paragraph[0][2]; replace it with a new Str element\nparagraph[0][2] = Str('Python!')\n\n# Write the modified document back to a markdown string\nmodified_text = pandoc.write(doc)\nprint(f\"Modified document text: {modified_text.strip()}\")\n\n# Example of converting to a different format\n# doc_to_convert = pandoc.read(\"# My Title\\n\\nHello from Pandoc!\", format='markdown')\n# html_output = pandoc.write(doc_to_convert, format='html')\n# print(f\"HTML output:\\n{html_output}\")","lang":"python","description":"This quickstart demonstrates how to read a Markdown string into a Pandoc document object, access and modify its Abstract Syntax Tree (AST) using `pandoc.types`, and then write the modified document back to a Markdown string. This showcases the core functionality for programmatic document manipulation."},"warnings":[{"fix":"Ensure the Pandoc executable is installed on your system and accessible in the system's PATH before using the Python `pandoc` library.","message":"The Python `pandoc` library is a thin wrapper and does not bundle the Pandoc executable. Users MUST install the Pandoc command-line tool separately (e.g., via `conda install pandoc`, `sudo apt install pandoc`, or `brew install pandoc`). 
Failure to do so will result in runtime errors as the Python library will not find the `pandoc` binary.","severity":"breaking","affected_versions":"All versions"},{"message":"The `pandoc` Python library should not be confused with `pypandoc`. While both are Python wrappers for Pandoc, `pypandoc` offers a `pypandoc_binary` package that bundles the Pandoc executable, whereas `pandoc` (this library) always requires a separate installation of the underlying Pandoc tool.","severity":"gotcha"},{"fix":"Always pass command-line arguments as separate list items to ensure correct parsing by the Pandoc executable. Use `shlex.split()` to convert a shell-style command string into such a list if needed.","message":"When programmatically interacting with the Pandoc executable (e.g., via Python's `subprocess` module or the `pandoc` library's underlying calls), command-line arguments, especially those with values, must be passed as distinct items in a list. Combining an option and its value into a single shell-style string (e.g., `'-V title=\"My Title\"'`) can lead to incorrect parsing, because no shell is involved to split the string or strip the quotes. Instead, use `['-V', 'title=My Title']`; quotes embedded in a list item are passed to Pandoc literally.","severity":"gotcha","affected_versions":"All versions when passing options to Pandoc"},{"fix":"Update Markdown documents to use ```` ```{.lang} ```` for code block language attributes when targeting Pandoc executable versions 3.1 or newer. If processing older documents, be aware of potential parsing differences.","message":"The underlying Pandoc executable (version 3.1 and later) changed how it parses code block attributes. The syntax ```` ```{lang} ```` (without a leading dot) is no longer interpreted as a language class but as a literal string. The correct syntax for specifying a language class is ```` ```{.lang} ````. 
This change in the Pandoc executable can affect how the Python `pandoc` library processes markdown documents.","severity":"breaking","affected_versions":"Pandoc executable versions >= 3.1"},{"fix":"Refer to the Python `pandoc` library's documentation or changelog for the Pandoc executable versions it is officially tested against and supports. Ensure your installed Pandoc executable matches a supported version range.","message":"The Python `pandoc` library is tested against specific versions of the Pandoc executable. While it might issue a warning for unsupported Pandoc executable versions instead of failing, using an incompatible version could lead to unexpected behavior or incorrect document transformations due to differences in the underlying document model.","severity":"gotcha","affected_versions":"All versions (when Pandoc executable version mismatch occurs)"}],"env_vars":null,"last_verified":"2026-04-14T00:00:00.000Z","next_check":"2026-07-13T00:00:00.000Z"}