pandocfilters
pandocfilters is a Python library of utilities for writing Pandoc filters. A filter manipulates Pandoc's Abstract Syntax Tree (AST), in its JSON representation, between the reader (parser) and the writer (output format). The current version is 1.5.1; releases are irregular and focus on maintenance and compatibility with Pandoc versions.
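To make the JSON form of the AST concrete, here is a minimal standard-library sketch of the kind of transformation a filter performs. It does not use pandocfilters at all; the sample document, the `pandoc-api-version` value, and the helper name `emph_to_strong` are assumptions chosen for illustration.

```python
import json

# Illustrative input in the shape `pandoc -t json` emits; the
# "pandoc-api-version" value is an assumption for this sketch.
raw = json.dumps({
    "pandoc-api-version": [1, 22],
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "very"},
            {"t": "Space"},
            {"t": "Emph", "c": [{"t": "Str", "c": "important"}]},
        ]},
    ],
})


def emph_to_strong(node):
    """Recursively copy the tree, retagging every Emph node as Strong."""
    if isinstance(node, list):
        return [emph_to_strong(child) for child in node]
    if isinstance(node, dict):
        copied = {k: emph_to_strong(v) for k, v in node.items()}
        if copied.get("t") == "Emph":
            copied["t"] = "Strong"
        return copied
    return node  # strings and numbers pass through unchanged


filtered = emph_to_strong(json.loads(raw))
out = json.dumps(filtered)  # the JSON a filter would hand on to the writer
```

Each node is a dict with a type tag `t` and, for most node types, contents `c`. A real filter reads this JSON on stdin and writes the modified JSON to stdout.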
Warnings
- breaking Python 2 support was officially dropped after `pandocfilters` version 1.5.0. If you are on Python 2, you must use `pandocfilters` <= 1.5.0.
- breaking Filters depend on the AST emitted by the underlying `pandoc` executable, so AST changes between `pandoc` releases can break them. Specific `pandocfilters` versions are tied to specific `pandoc` versions.
- gotcha The `get_filename4code` utility (used for generating filenames for code blocks, e.g., for images) by default creates temporary directories that are NOT automatically cleaned up. This can lead to an accumulation of files if not managed.
- gotcha Starting from version 1.5.0, the `examples/` directory is no longer included in the PyPI distribution (source or binary wheels). It is only available in the source repository on GitHub.
- gotcha Pandoc itself offers built-in Lua filters (since Pandoc 2.0) which do not require external language interpreters (like Python) and may offer better performance for some use cases. An alternative Python library, `panflute`, also exists, offering a more 'Pythonic' API.
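For the `get_filename4code` gotcha above, one way to avoid accumulating files is to remove the directory yourself once the filter no longer needs it. This is a sketch under assumptions: the module name `demo-filter` and the code-block content are hypothetical, and it relies only on the returned path lying inside the directory the helper creates.

```python
import os
import shutil

from pandocfilters import get_filename4code

# "demo-filter" is a hypothetical filter name used only for this sketch.
path = get_filename4code("demo-filter", "graph { a -- b }", ext="png")
workdir = os.path.dirname(path)  # directory the helper created for us

# A real filter would render an image here; we write placeholder bytes.
with open(path, "wb") as handle:
    handle.write(b"placeholder")

# Explicit cleanup, since the directory is not removed automatically.
shutil.rmtree(workdir, ignore_errors=True)
```

Deleting the directory also discards cached outputs, so a filter that reuses generated images between runs may prefer to keep it and prune old files instead.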
Install
-
pip install pandocfilters
Imports
- toJSONFilter
from pandocfilters import toJSONFilter
- walk
from pandocfilters import walk
- Str, Para, Header, Emph
from pandocfilters import Str, Para, Header, Emph
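As a rough illustration of how these imports fit together, the sketch below builds a small AST fragment with the element constructors and transforms it with `walk`. The action name `demote`, the sample content, and the empty format string and metadata passed to `walk` are assumptions for this example.

```python
from pandocfilters import walk, Str, Para, Header, Emph

# The constructors return plain dicts in Pandoc's JSON AST shape.
blocks = [
    Header(1, ["intro", [], []], [Str("Intro")]),  # level, attr, inlines
    Para([Str("plain"), Emph([Str("stressed")])]),
]


def demote(key, value, format, meta):
    """Push every header one level down; leave other nodes untouched."""
    if key == "Header":
        level, attr, inlines = value
        return Header(level + 1, attr, inlines)
    # Falling through returns None, which means "keep this node as is".


shifted = walk(blocks, demote, "", {})
```

`walk` calls the action with each node's type and contents. Returning a new node replaces the original, and returning `None` keeps it.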
Quickstart
#!/usr/bin/env python
"""
Pandoc filter to convert all regular text ('Str' elements) to uppercase.
Run with: pandoc input.md --filter ./caps_filter.py -o output.html
"""
from pandocfilters import toJSONFilter, Str
def caps(key, value, format, meta):
    # Returning None (the implicit default) leaves other elements unchanged.
    if key == 'Str':
        return Str(value.upper())

if __name__ == "__main__":
    toJSONFilter(caps)
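To try the filter logic without invoking pandoc, the action can be applied to a JSON string in-process with `applyJSONFilters`. This is a minimal sketch: the sample document and its `pandoc-api-version` value are assumptions, and `caps` is repeated here so the block is self-contained.

```python
import json

from pandocfilters import applyJSONFilters, Str


def caps(key, value, format, meta):
    if key == "Str":
        return Str(value.upper())


# Hand-written stand-in for the output of `pandoc -t json`.
doc = {
    "pandoc-api-version": [1, 22],  # illustrative value
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "hello"},
            {"t": "Space"},
            {"t": "Str", "c": "world"},
        ]},
    ],
}

# applyJSONFilters takes and returns JSON text, so round-trip through json.
result = json.loads(applyJSONFilters([caps], json.dumps(doc), "html"))
words = [n["c"] for n in result["blocks"][0]["c"] if n["t"] == "Str"]
```

This makes the action easy to unit test. Running the full pipeline still requires the `pandoc` executable.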