extract-msg: Outlook MSG File Extractor
extract-msg is a Python library designed to parse and extract emails and their attachments from Microsoft Outlook's proprietary .msg files. It supports various MSG file formats, including embedded messages and complex structures, and can handle different encodings. The library is actively maintained with frequent minor and patch releases, currently at version 0.55.0.
Warnings
- gotcha The default `maxNameLength` for filenames when saving attachments or message data changed from 256 to 40 characters in version 0.55.0. If you relied on the old default, saved filenames may now be truncated; pass `maxNameLength` explicitly when saving to keep longer names.
- breaking The prepared HTML output (e.g., via `msg.htmlBody`) changed in version 0.54.0 to use plainly encoded HTML instead of a prettified format. If your application parsed or relied on the structure of the prettified HTML, this change may affect you.
- gotcha Prior to version 0.55.0, calling `openMsg()` or `Message` (when opening specific OLE files that were not standard MSG files) without a context manager (`with ... as`) could leave the underlying OLE file handle open, leaking resources. `openMsg()` was fixed internally in 0.55.0, but it remains good practice to always use a context manager.
- gotcha Encoding issues, particularly with child/embedded MSG files and their interaction with the parent's encoding, have been a source of bugs (e.g., fixed in v0.54.1, v0.52.0). While fixes are implemented, be aware that complex nested MSG structures or malformed files can still present encoding challenges.
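The filename-length and file-handle warnings above can be handled together. The following is a minimal sketch, not the library's recommended recipe: the helper name `save_message_with_long_names` and its parameters are hypothetical, and it assumes that `save()` on the object returned by `openMsg()` accepts the `customPath` and `maxNameLength` keyword arguments.

```python
from pathlib import Path


def save_message_with_long_names(msg_path, out_dir, max_name_length=256):
    """Save a message and its attachments with an explicit filename length limit.

    Hypothetical helper for illustration; it is not part of extract-msg.
    """
    msg_path = Path(msg_path)
    if not msg_path.is_file():
        raise FileNotFoundError(f"No such .msg file: {msg_path}")

    # Imported lazily so the path check above runs even if the package is missing.
    import extract_msg

    Path(out_dir).mkdir(parents=True, exist_ok=True)

    # The context manager closes the underlying OLE file on exit,
    # including when save() raises.
    with extract_msg.openMsg(str(msg_path)) as msg:
        # Assumption: save() accepts customPath and maxNameLength keywords.
        # Passing the limit explicitly avoids depending on the version default.
        msg.save(customPath=str(out_dir), maxNameLength=max_name_length)
```

Calling `save_message_with_long_names('example.msg', 'saved_messages')` would then write the message under `saved_messages` with the pre-0.55.0 limit of 256 characters, provided the installed version honours the keyword as assumed.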
Install
pip install extract-msg
Imports
- Message
from extract_msg import Message
Quickstart
import os
import sys

from extract_msg import Message

# Path to an existing Outlook .msg file. A valid .msg file has a complex
# binary structure and cannot be generated here, so point this at a real
# sample message.
msg_file_path = 'example.msg'

if not os.path.exists(msg_file_path):
    print(f"Please create a file named '{msg_file_path}' containing a valid Outlook .msg email to run this example.")
    sys.exit(1)

try:
    # The context manager closes the underlying file when the block exits.
    with Message(msg_file_path) as msg:
        print(f"Subject: {msg.subject}")
        print(f"Sender: {msg.sender}")
        print(f"Date: {msg.date}")
        # The plain-text body can be missing, so fall back to an empty string.
        body = msg.body or ''
        print(f"Body (plain text):\n{body[:200]}...")  # First 200 characters

        if msg.attachments:
            print(f"\nAttachments found: {len(msg.attachments)}")
            output_dir = 'attachments_output'
            os.makedirs(output_dir, exist_ok=True)
            for attachment in msg.attachments:
                filename = attachment.longFilename or attachment.shortFilename
                if filename:
                    try:
                        attachment.save(customPath=output_dir, raw=False)
                        print(f"  Saved attachment: {filename}")
                    except Exception as e:
                        print(f"  Error saving attachment {filename}: {e}")
        else:
            print("\nNo attachments.")
except FileNotFoundError:
    print(f"Error: The file '{msg_file_path}' was not found. Please ensure it exists.")
except Exception as e:
    print(f"An error occurred while processing the MSG file: {e}")
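The quickstart reads only the plain-text body. For the HTML body, code that depends on the exact whitespace or layout of the markup is fragile, as the 0.54.0 change noted in the warnings shows. The sketch below uses only the standard library to pull visible text out of HTML regardless of how it is formatted. The names `html_body_to_text` and `_TextCollector` are hypothetical, and it assumes `msg.htmlBody` yields bytes (or `None` when no HTML body exists).

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect visible text, ignoring how the markup is laid out."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0  # Greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ('script', 'style'):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ('script', 'style') and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Drop whitespace-only runs so indentation between tags is ignored.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return ' '.join(self._chunks)


def html_body_to_text(html_body, encoding='utf-8'):
    """Return the visible text of an HTML body given as bytes, str, or None."""
    if html_body is None:
        return ''
    if isinstance(html_body, bytes):
        # Assumption: the caller knows the encoding; undecodable bytes are replaced.
        html_body = html_body.decode(encoding, errors='replace')
    parser = _TextCollector()
    parser.feed(html_body)
    parser.close()
    return parser.text()
```

Inside the `with Message(...) as msg:` block this could be called as `html_body_to_text(msg.htmlBody)`. Because text chunks are joined with single spaces, text split across inline tags (for example `<b>Hel</b>lo`) gains a space, so treat the result as an approximation, not a faithful rendering.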