{"id":1599,"library":"olefile","title":"olefile - OLE2 File Parser","description":"The olefile library is a Python package for parsing, reading, and writing Microsoft OLE2 files, also known as Structured Storage or Compound Documents. This container format underlies older Microsoft Office file types (e.g., .doc, .xls, .ppt, .msg) and acts as a file system within a file. olefile offers low-level access to the streams and storages inside such a container. The current version is 0.47; updates focus on bug fixes and robustness.","status":"active","version":"0.47","language":"en","source_language":"en","source_url":"https://github.com/decalage2/olefile","tags":["ole","office","document","parsing","structured-storage","file-format","ms-office"],"install":[{"cmd":"pip install olefile","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"symbol":"olefile","correct":"import olefile"},{"note":"Can also be accessed via olefile.OleFileIO after 'import olefile'.","symbol":"OleFileIO","correct":"from olefile import OleFileIO"}],"quickstart":{"code":"import olefile\nimport os\n\n# For a real test, replace 'example.doc' with the path to an actual OLE file.\n# This example uses a placeholder path and demonstrates the basic API.\n# If the file does not exist or is not an OLE file, appropriate messages will be printed.\n\nole_file_path = 'example.doc' # Replace with a path to a real OLE file\n\n# isOleFile() opens the path itself and raises if the file is missing,\n# so confirm that the file exists before calling it.\nif os.path.isfile(ole_file_path) and olefile.isOleFile(ole_file_path):\n    try:\n        # Open the OLE file\n        ole = olefile.OleFileIO(ole_file_path)\n\n        print(f\"Opened OLE file: {ole_file_path}\")\n\n        # List all streams and storages (listdir() returns only streams by default)\n        print(\"\\nStreams and Storages:\")\n        for entry_path in ole.listdir(streams=True, storages=True):\n            print(f\"- {'/'.join(entry_path)}\")\n\n        # Example: check if a specific stream exists and read its content\n        target_stream = ['WordDocument'] # Common stream in Word docs\n        if ole.exists(target_stream):\n   
         # Read stream content (returns bytes)\n            data = ole.openstream(target_stream).read()\n            print(f\"\\nContent of '{'/'.join(target_stream)}' (first 100 bytes):\")\n            print(data[:100])\n        else:\n            print(f\"\\nStream '{'/'.join(target_stream)}' not found.\")\n\n        # Close the file when done\n        ole.close()\n\n    except Exception as e:\n        print(f\"Error processing OLE file '{ole_file_path}': {e}\")\nelif os.path.exists(ole_file_path):\n    print(f\"'{ole_file_path}' exists but is not a valid OLE file.\")\nelse:\n    print(f\"'{ole_file_path}' does not exist.\")\n    print(\"Please provide a valid path to a Microsoft OLE2 Structured Storage file for testing.\")\n","lang":"python","description":"This quickstart demonstrates how to check that a file is a valid OLE container, open it, list its internal streams and storages, and read the content of a specific stream. Replace 'example.doc' with the path to your own OLE file (e.g., a .doc, .xls, .ppt, or .msg file) for a real test."},"warnings":[{"fix":"Use `ole.get_metadata()` to retrieve the standard document properties (e.g., author, title) as a structured object, or `ole.getproperties()` to parse a specific property stream by name.","message":"The `OleFileIO.meta` attribute was removed in version 0.43. Previously, this attribute provided access to OLE properties (e.g., Author, Title). Direct access to these properties using `ole.meta` will now raise an `AttributeError`.","severity":"breaking","affected_versions":"<=0.42"},{"fix":"Upgrade to `olefile` version 0.45.1 or newer. Always combine `isOleFile()` with robust error handling around `OleFileIO` instantiation, especially when dealing with untrusted input files.","message":"The `olefile.isOleFile()` function could return false positives for certain non-OLE files in versions prior to 0.45.1, incorrectly identifying them as OLE files. 
This could lead to parsing errors or unexpected behavior when attempting to open such files with `OleFileIO`.","severity":"gotcha","affected_versions":"<0.45.1"},{"fix":"Ensure that any OLE files you intend to parse are unencrypted. If encryption is present, you would need to decrypt the file using external tools or libraries before processing it with `olefile`.","message":"The `olefile` library does not inherently support parsing encrypted or password-protected OLE files. While it may be able to parse the high-level structure, the actual data streams within such files will remain encrypted and unreadable by `olefile`.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}