EML Parser Library
eml-parser is a Python library designed for parsing EML (Email Message) files, extracting headers, body content, attachments, and other email components into a structured dictionary. It is currently at version 3.0.0 and maintains an active development pace with releases for new features, bug fixes, and Python version compatibility.
Common errors
- ModuleNotFoundError: No module named 'eml_parser'
  cause The `eml-parser` library is not installed in your current Python environment, or the environment is not active.
  fix Activate the intended environment and run `pip install eml-parser` to install the package.
- SyntaxError: invalid syntax (some_file.py, line X)
  cause You are attempting to run `eml-parser` version 3.0.0 or higher on a Python version older than 3.10. The library uses Python 3.10+ typing syntax.
  fix Upgrade your Python interpreter to version 3.10 or newer. If you cannot upgrade Python, install an older compatible version of the library: `pip install "eml-parser<3.0.0"` (the quotes stop the shell from treating `<` as a redirect).
- KeyError: 'subject'
  cause The expected key (e.g., 'subject', 'from') does not exist in the parsed EML header or body dictionary for the specific email being processed. This often happens with malformed or unusual EML files.
  fix Check that a key exists before accessing it, especially when dealing with potentially non-standard EML files. Use `.get()` with a default value, wrap the access in a `try`/`except KeyError` block, or test with `if key in dict:`.
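The `.get()` approach for the `KeyError` case can be sketched as follows. This is a minimal illustration, not part of the library: the helper name `header_value` and the sample dictionary are hypothetical, and the dictionary only mimics the `parsed_eml['header']` shape used in the quickstart.

```python
from typing import Any


def header_value(parsed_eml: dict, key: str, default: Any = None) -> Any:
    """Return parsed_eml['header'][key], or default if either level is missing."""
    header = parsed_eml.get("header") or {}
    return header.get(key, default)


# Hypothetical parsed result for a message that has no Subject header
sample = {"header": {"from": "sender@example.com"}, "body": []}

print(header_value(sample, "from"))                     # key is present
print(header_value(sample, "subject", "(no subject)"))  # missing key uses the default
```

Routing every header lookup through one helper keeps the fallback behavior in a single place, which is easier to audit than scattered `try`/`except` blocks.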
Warnings
- breaking Version 3.0.0 and newer require Python 3.10 or higher. Running on older Python versions can fail with `SyntaxError` (from the newer type-hint syntax) or `ModuleNotFoundError`.
- breaking The typing syntax was upgraded for Python 3.10+ in v3.0.0. If you have custom code extending or directly interacting with internal types, it might require adjustments.
- gotcha Version 2.0.0 introduced support for custom parsing policies. If your application depends on specific parsing behavior, check whether the policy options affect it and whether you need a custom policy.
- gotcha Starting from v1.17.0, new validation options were added for URLs, email addresses, and IP addresses (e.g., Public Suffix List, `ip_force_routable`, `domain_force_tld`). This might cause previously parsed 'valid' items to be filtered out.
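The Python 3.10 requirement noted above can be checked at startup, before the library is imported. The sketch below uses only the standard library; the function name `python_is_supported` and the printed message are hypothetical.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum stated above for eml-parser 3.0.0 and newer


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given interpreter version meets the 3.10 minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    # Hypothetical message; adapt to your application's error handling
    print(
        "eml-parser >= 3.0.0 needs Python 3.10+, found "
        f"{sys.version_info.major}.{sys.version_info.minor}"
    )
```

Failing early with a clear message is usually easier to diagnose than a `SyntaxError` raised from inside an installed package.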
Install
- pip install eml-parser
Imports
- EmlParser
from eml_parser import EmlParser
Quickstart
from eml_parser import EmlParser

# Raw message as bytes; a blank line separates the headers from the body
eml_content = b"""From: sender@example.com
To: receiver@example.com
Subject: Test Email
Date: Sun, 1 Jan 2023 12:00:00 +0000
Content-Type: text/plain; charset="utf-8"

This is a sample email body.
"""

# Initialize the parser; include_raw_body=True keeps the body text in the result
parser = EmlParser(include_raw_body=True)

# Parse the EML content from bytes
parsed_eml = parser.decode_email_bytes(eml_content)

# Print some extracted data
print(f"From: {parsed_eml['header']['from']}")
print(f"Subject: {parsed_eml['header']['subject']}")
if parsed_eml['body']:
    print(f"Body content: {parsed_eml['body'][0]['content']}")

# Example for parsing from a file
# with open('path/to/your.eml', 'rb') as f:
#     file_content = f.read()
# parsed_eml_file = parser.decode_email_bytes(file_content)
# print(f"File Subject: {parsed_eml_file['header']['subject']}")
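To inspect or store the structured dictionary, it can be dumped as JSON. The sketch below assumes the parsed header carries its date as a `datetime` object, which `json.dumps` cannot serialize without help. The `sample` dictionary is a hypothetical stand-in for a parsed result, and `json_serial` is an illustrative helper name.

```python
import datetime
import json


def json_serial(obj):
    """Convert date and datetime values to ISO 8601 strings for json.dumps."""
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    raise TypeError(f"Type {type(obj).__name__} is not JSON serializable")


# Hypothetical stand-in for a parsed result; assumes header['date'] is a datetime
sample = {
    "header": {
        "subject": "Test Email",
        "date": datetime.datetime(2023, 1, 1, 12, 0, tzinfo=datetime.timezone.utc),
    }
}

print(json.dumps(sample, default=json_serial, indent=2))
```

Raising `TypeError` for any other unsupported type keeps serialization problems visible rather than silently writing `null`.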