{"id":8334,"library":"msg-parser","title":"MSG Parser","description":"The `msg-parser` module enables reading, parsing, and converting Microsoft Outlook MSG E-Mail files. It facilitates extracting email properties, handling nested MSG/EML attachments, and outputting message content as JSON strings or EML files. The library is currently at version 1.2.0 (last released December 2019) and is compatible with Python 3.4 and higher.","status":"maintenance","version":"1.2.0","language":"en","source_language":"en","source_url":"https://github.com/vikramarsid/msg_parser","tags":["email","outlook","msg","parser","file-format","office-docs"],"install":[{"cmd":"pip install msg_parser","lang":"bash","label":"Basic Installation"},{"cmd":"pip install msg_parser[rtf]","lang":"bash","label":"With RTF Decompression"}],"dependencies":[{"reason":"Required for parsing the OLE2 Compound Document Format of MSG files.","package":"olefile","optional":false},{"reason":"Needed for decompressing RTF-encoded email bodies, installed via the `[rtf]` extra.","package":"rtf (via extra)","optional":true}],"imports":[{"note":"This is the primary class for interacting with MSG files.","symbol":"MsOxMessage","correct":"from msg_parser import MsOxMessage"}],"quickstart":{"code":"import os\nfrom msg_parser import MsOxMessage\n\n# Create a dummy MSG file path for demonstration\n# In a real scenario, replace 'path/to/your/email.msg' with your actual file.\nmsg_file_path = os.environ.get('MSG_FILE_PATH', 'path/to/your/email.msg')\n\nif not os.path.exists(msg_file_path):\n    print(f\"Warning: MSG file not found at '{msg_file_path}'. 
Cannot run quickstart.\")\n    print(\"Please provide a valid MSG file path via MSG_FILE_PATH environment variable or directly.\")\nelse:\n    try:\n        msg_obj = MsOxMessage(msg_file_path)\n\n        # Get message properties as a dictionary\n        properties = msg_obj.get_properties()\n        print(f\"Subject: {properties.get('subject')}\")\n        print(f\"From: {properties.get('sender_name')}\")\n\n        # Get message body (plain text, html, or rtf)\n        # The library tries to provide the 'cleanest' body available.\n        body = msg_obj.body\n        if body:\n            print(\"\\nBody snippet:\")\n            print(body[:200] + '...' if len(body) > 200 else body)\n\n        # Ensure an output directory exists for the attachments and the EML file.\n        # It is created before the loop so it is defined even when there are no attachments.\n        output_dir = 'attachments_output'\n        os.makedirs(output_dir, exist_ok=True)\n\n        # Iterate and save attachments\n        print(f\"\\nAttachments found: {len(msg_obj.attachments)}\")\n        for i, attachment in enumerate(msg_obj.attachments):\n            attachment_path = os.path.join(output_dir, attachment.long_filename)\n            attachment.save(attachment_path)\n            print(f\"Saved attachment {i+1}: {attachment_path}\")\n\n        # Convert message to EML format and save\n        output_eml_path = os.path.join(output_dir, 'output_email.eml')\n        msg_obj.save_email_file(output_eml_path)\n        print(f\"Converted MSG to EML: {output_eml_path}\")\n\n        # Get message as JSON string\n        # json_string = msg_obj.get_message_as_json()\n        # print(\"\\nMessage as JSON (first 500 chars):\")\n        # print(json_string[:500] + '...')\n\n    except Exception as e:\n        print(f\"Error processing MSG file: {e}\")","lang":"python","description":"This quickstart demonstrates how to load an MSG file, access its properties (subject, sender), retrieve the message body, iterate through and save attachments, and convert the MSG file to EML format. 
It includes basic error handling for file existence."},"warnings":[{"fix":"Process large files sequentially, consider available memory, or explore alternative libraries (e.g., `extract-msg`) if memory becomes a bottleneck. Be mindful of system resources when dealing with numerous or massive MSG files.","message":"The library reads the entire MSG file into memory for parsing. For very large MSG files or batch processing of many files, this can lead to high memory consumption, potentially causing `MemoryError`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always wrap file parsing in `try-except` blocks to gracefully handle potential parsing errors. Validate input file integrity where possible, or use more robust error handling for unexpected file structures.","message":"Parsing malformed or corrupted MSG files can lead to `Exception`s such as 'Invalid MSG file provided, 'properties_version1.0' stream data is empty.' or unexpected behavior. The library expects a correctly structured OLE2 Compound Document.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Test thoroughly with your specific MSG file types and Python versions. If you encounter issues with newer Python versions or complex/recent MSG features, you might need to contribute to the project or consider alternative, more actively maintained libraries.","message":"The library's last release was in December 2019. While functional, it may not receive updates for new Python versions (beyond current compatibility with 3.4+) or support the very latest intricacies of Microsoft Outlook's MSG format, which can evolve.","severity":"gotcha","affected_versions":"<=1.2.0"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Ensure the input file is a valid and intact Outlook MSG file. Verify the file path is correct and the file isn't zero-byte. 
If files are generated by third-party tools, check their output for compliance.","cause":"The MSG file is corrupted, empty, or does not conform to the expected OLE2 Compound Document format for Outlook messages; in particular, the '__properties_version1.0' stream is missing or empty.","error":"Exception: Invalid MSG file provided, 'properties_version1.0' stream data is empty."},{"fix":"Double-check the `msg_file_path` variable. Ensure the file exists at that exact location and that you have read permissions. Use `os.path.exists()` as a pre-check, or provide an absolute path.","cause":"The specified path to the MSG file does not exist or is incorrect. Python cannot find the file to open it.","error":"FileNotFoundError: [Errno 2] No such file or directory: 'path/to/your/email.msg'"},{"fix":"The `msg-parser` library generally handles encodings itself, so this error usually indicates an edge case. The `MsOxMessage` constructor does not expose an encoding option for the main file content, so the encoding cannot be overridden directly. If possible, inspect the problematic part of the file to determine its true encoding.","cause":"The library or an underlying component (e.g., `olefile`) is attempting to decode a byte string as UTF-8, but the content uses a different encoding (e.g., CP1252, Shift_JIS, or another Unicode encoding) or contains byte sequences that are invalid in UTF-8. This often happens with non-English characters in older MSG files.","error":"UnicodeDecodeError: 'utf-8' codec can't decode byte 0x__ in position __: invalid start byte"}]}