MSG Parser

1.2.0 · maintenance · verified Thu Apr 16

The `msg-parser` module reads, parses, and converts Microsoft Outlook MSG email files. It extracts email properties, handles nested MSG/EML attachments, and outputs message content as a JSON string or an EML file. The current version is 1.2.0 (last released December 2019), and it is compatible with Python 3.4 and higher.
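
As a rough sketch of consuming the JSON output mentioned above: `get_message_as_json()` returns a string, which the standard `json` module can decode. The helper names `message_to_dict` and `top_level_keys` are hypothetical and not part of the library. They accept any object exposing that method, and they assume the top level of the JSON is an object. No particular key names are assumed.

```python
import json


def message_to_dict(msg_obj):
    """Decode the JSON string produced by a message object.

    Hypothetical helper: it only assumes msg_obj has get_message_as_json().
    """
    return json.loads(msg_obj.get_message_as_json())


def top_level_keys(msg_obj):
    """Return the top-level keys of the decoded message, sorted for stable output.

    Assumes the decoded JSON is an object (a Python dict).
    """
    return sorted(message_to_dict(msg_obj).keys())


# Intended usage with the real library (requires msg_parser and an MSG file):
# from msg_parser import MsOxMessage
# print(top_level_keys(MsOxMessage('path/to/your/email.msg')))
```

Listing the keys first is a cheap way to see which fields a given message actually carries before relying on any of them.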

Quickstart

This quickstart loads an MSG file, reads its properties (subject, sender), retrieves the message body, saves each attachment, and converts the message to EML format. It checks that the file exists before parsing and catches any exception raised during processing. The example uses f-strings, so it needs Python 3.6 or later as written.

import os
from msg_parser import MsOxMessage

# Read the MSG file path from the MSG_FILE_PATH environment variable,
# falling back to a placeholder that you should replace with a real file.
msg_file_path = os.environ.get('MSG_FILE_PATH', 'path/to/your/email.msg')

if not os.path.exists(msg_file_path):
    print(f"Warning: MSG file not found at '{msg_file_path}'. Cannot run quickstart.")
    print("Please provide a valid MSG file path via MSG_FILE_PATH environment variable or directly.")
else:
    try:
        msg_obj = MsOxMessage(msg_file_path)

        # Get message properties as a dictionary
        properties = msg_obj.get_properties()
        print(f"Subject: {properties.get('subject')}")
        print(f"From: {properties.get('sender_name')}")

        # Get message body (plain text, html, or rtf)
        # The library tries to provide the 'cleanest' body available.
        body = msg_obj.body
        if body:
            print("\nBody snippet:")
            print(body[:200] + '...' if len(body) > 200 else body)

        # Create the output directory up front so it exists even when the
        # message has no attachments (it is reused for the EML file below).
        output_dir = 'attachments_output'
        os.makedirs(output_dir, exist_ok=True)

        # Iterate and save attachments
        print(f"\nAttachments found: {len(msg_obj.attachments)}")
        for i, attachment in enumerate(msg_obj.attachments):
            attachment_path = os.path.join(output_dir, attachment.long_filename)
            attachment.save(attachment_path)
            print(f"Saved attachment {i+1}: {attachment_path}")

        # Convert message to EML format and save
        output_eml_path = os.path.join(output_dir, 'output_email.eml')
        msg_obj.save_email_file(output_eml_path)
        print(f"Converted MSG to EML: {output_eml_path}")

        # Get message as JSON string
        # json_string = msg_obj.get_message_as_json()
        # print("\nMessage as JSON (first 500 chars):")
        # print(json_string[:500] + '...')

    except Exception as e:
        print(f"Error processing MSG file: {e}")
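
The quickstart joins each attachment's reported filename directly onto the output directory. That name comes from the message itself, so it may be empty, contain path separators, or collide with an earlier file. The following is a minimal sketch of one way to guard against that using only the standard library. `safe_attachment_path` is a hypothetical helper, not part of msg-parser.

```python
import os
import re


def safe_attachment_path(output_dir, filename, index):
    """Build a path inside output_dir for a filename taken from a message.

    Hypothetical helper, not part of msg-parser. `index` is only used to
    generate a fallback name when the reported filename is unusable.
    """
    # Treat Windows separators like POSIX ones, then keep the last component
    # so names such as '../../etc/passwd' cannot escape output_dir.
    name = os.path.basename((filename or '').replace('\\', '/'))
    # Replace characters that are invalid on common filesystems.
    name = re.sub(r'[<>:"/\\|?*\x00-\x1f]', '_', name).strip(' .')
    if not name:
        name = 'attachment_{}'.format(index)

    candidate = os.path.join(output_dir, name)
    # Append a counter rather than overwrite an existing file.
    stem, ext = os.path.splitext(name)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(output_dir, '{}_{}{}'.format(stem, counter, ext))
        counter += 1
    return candidate
```

In the quickstart loop, `attachment_path = safe_attachment_path(output_dir, attachment.long_filename, i + 1)` could stand in for the direct `os.path.join` call.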
