mail-parser-reply
mail-parser-reply is a Python library (current version 1.36) that parses incoming email messages and splits them into individual replies. It supports multiple languages and can strip headers, signatures, and disclaimers so that only the relevant text content remains. The library is actively maintained and offers a fully type-annotated implementation intended as an improvement over older email reply parsing tools.
Common errors
- TypeError: EmailReplyParser() missing 1 required positional argument: 'languages'
  cause: The `EmailReplyParser` class requires the `languages` argument during instantiation.
  fix: Instantiate `EmailReplyParser` with a list of language codes, e.g., `EmailReplyParser(languages=['en'])`.
- AttributeError: 'EmailReplyParser' object has no attribute 'replies' (or 'parse_reply')
  cause: `read()` and `parse_reply()` are instance methods and must be called on an `EmailReplyParser` instance, not on the class. The `replies` attribute belongs to the object returned by `read()`, not to the parser itself.
  fix: First, create an instance: `parser = EmailReplyParser(languages=['en'])`. Then call the methods on the instance: `email_message = parser.read(text=mail_body)`, and read the replies from the result: `email_message.replies`.
Warnings
- gotcha Mail clients handle reply formatting in diverse ways, which can make consistent and reliable parsing inherently challenging. The library is designed to mitigate this but edge cases may still exist.
- gotcha Some supported languages (e.g., Czech, Spanish, Korean, Chinese) are explicitly marked as 'untested' in the documentation. Parsing accuracy for these languages may be less reliable than for thoroughly tested ones.
- gotcha The library primarily focuses on 'text-based mail parsing'. While it can handle headers and signatures, complex HTML-rich emails might require prior conversion to plain text for optimal and accurate reply extraction.
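The plain-text conversion mentioned in the last warning can be sketched with the standard library alone. This is a minimal illustration, not part of mail-parser-reply: the names `html_to_text` and `_TextExtractor` are hypothetical, and the choice of which tags count as block-level is an assumption. The conversion is lossy (blank lines and quote markers are dropped), so a dedicated HTML-to-text tool may give better input for real mail.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, breaking lines at block tags."""

    # Assumed set of tags that should start or end a line of text.
    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "blockquote", "h1", "h2", "h3"}
    # Content inside these tags is never visible to the reader.
    SKIP_TAGS = {"script", "style", "head"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Trim each line and drop empty ones.
        lines = (line.strip() for line in "".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html: str) -> str:
    """Return a rough plain-text rendering of an HTML mail body."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.text()
```

The resulting string can then be passed as the `text` argument to `parser.read()` or `parser.parse_reply()`.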
Install
- pip install mail-parser-reply
Imports
- EmailReplyParser
  wrong: from mail_parser_reply import EmailReplyParser
  correct: from mailparser_reply import EmailReplyParser
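Because the distribution name (`mail-parser-reply`) differs from the importable module name (`mailparser_reply`), it can help to check both before debugging an import error. The following is a small sketch using only the standard library; the function name `describe_install` is hypothetical and not part of the package.

```python
from importlib import metadata, util


def describe_install(dist_name: str = "mail-parser-reply",
                     module_name: str = "mailparser_reply") -> dict:
    """Report the installed distribution version and whether the module imports."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # the distribution is not installed in this environment
    importable = util.find_spec(module_name) is not None
    return {"version": version, "importable": importable}


if __name__ == "__main__":
    print(describe_install())
```

If `version` is reported but `importable` is false, the import statement is probably using the wrong module name or a different interpreter than the one `pip` installed into.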
Quickstart
from mailparser_reply import EmailReplyParser
mail_body = """Awesome! I haven't had another problem with it. Thanks, alfonsrv
On Wed, Dec 20, 2023 at 13:37, RAUSYS <info@rausys.de> wrote:
> The good news is that I've found a much better query for lastLocation.
> It should run much faster now. Can you double-check?
"""
# Instantiate the parser with desired languages
parser = EmailReplyParser(languages=['en', 'de'])
# Parse the entire email; the result exposes the EmailReply objects as .replies
email_message = parser.read(text=mail_body)
print("All replies:")
for reply in email_message.replies:
    # reply.body is the reply text without headers, signatures, or disclaimers
    print(f"- {reply.body}")
# Or get only the latest reply as a string
latest_reply = parser.parse_reply(text=mail_body)
print("\nLatest reply:")
print(latest_reply)
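Since reply formatting varies between mail clients, callers may want a fallback when no reply can be extracted. The sketch below wraps the `parse_reply(text=...)` call from the quickstart; the helper name `latest_reply_or_original` and the `ReplyParser` protocol are hypothetical, and treating an empty or `None` result as "nothing extracted" is an assumption about how the parser signals that case.

```python
from typing import Optional, Protocol


class ReplyParser(Protocol):
    """Anything with a parse_reply(text=...) method, e.g. an EmailReplyParser instance."""

    def parse_reply(self, text: str) -> Optional[str]: ...


def latest_reply_or_original(parser: ReplyParser, text: str) -> str:
    """Return the latest reply, or the whole stripped body if nothing was extracted."""
    latest = parser.parse_reply(text=text)
    latest = (latest or "").strip()  # assumption: empty/None means no reply found
    return latest or text.strip()
```

With the quickstart objects this would be called as `latest_reply_or_original(parser, mail_body)`. Falling back to the full body keeps downstream code from receiving an empty string, at the cost of possibly including quoted text.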