{"id":2114,"library":"markitdown","title":"MarkItDown","description":"MarkItDown is a Python utility library for converting various file formats, such as DOCX, PDF, and CSV, into Markdown. It accepts several input sources, including local files, URLs, and data URIs. The current version is 0.1.5, and the project is actively maintained with regular maintenance and feature releases.","status":"active","version":"0.1.5","language":"en","source_language":"en","source_url":"https://github.com/microsoft/markitdown","tags":["markdown","conversion","document processing","docx","pdf","csv","uri"],"install":[{"cmd":"pip install markitdown","lang":"bash","label":"Basic installation"},{"cmd":"pip install markitdown[all]","lang":"bash","label":"Install with all optional converters"},{"cmd":"pip install markitdown[docx,pdf]","lang":"bash","label":"Install with specific converters (e.g., DOCX and PDF)"}],"dependencies":[{"reason":"Required for DOCX conversion.","package":"mammoth","optional":true},{"reason":"Required for PDF conversion.","package":"pdfminer.six","optional":true},{"reason":"May be used by ML-based parsing components; its version constraints have changed across releases (see warnings).","package":"onnxruntime","optional":true}],"imports":[{"symbol":"MarkItDown","correct":"from markitdown import MarkItDown"}],"quickstart":{"code":"from markitdown import MarkItDown\nimport base64\n\nmarkitdown = MarkItDown()\n\n# Convert a data URI containing plain text to Markdown\ntext_content = \"Hello from MarkItDown!\\nThis is a test.\\n\\n- Item 1\\n- Item 2\"\nbase64_content = base64.b64encode(text_content.encode('utf-8')).decode('utf-8')\ndata_uri = f\"data:text/plain;base64,{base64_content}\"\n\nresult = markitdown.convert_uri(data_uri)\nprint(f\"Converted Markdown:\\n{result.markdown}\")\n\n# Example of how you would convert a local file (replace with an actual path)\n# try:\n#     result = markitdown.convert_uri(\"file:///path/to/your/document.docx\")\n#     print(f\"Converted DOCX:\\n{result.markdown}\")\n# except Exception as e:\n#     print(f\"Could not convert DOCX: {e} (Ensure you have installed markitdown[docx] and the file exists)\")","lang":"python","description":"Initializes MarkItDown and converts a data URI containing plain text. It also includes commented-out code for converting a local file, which requires the matching optional dependency group to be installed."},"warnings":[{"fix":"Install with the specific feature groups needed, e.g., `pip install markitdown[docx]` for DOCX support, or `pip install markitdown[all]` for all converters.","message":"Starting with v0.1.0, MarkItDown introduced a plugin-based architecture and optional dependency groups (e.g., `[docx]`, `[pdf]`, `[all]`). If you only run `pip install markitdown`, you may lack converters for specific file types.","severity":"gotcha","affected_versions":">=0.1.0"},{"fix":"Update calls from `markitdown.convert_url(...)` to `markitdown.convert_uri(...)`.","message":"The `convert_url` method was renamed to `convert_uri` in v0.1.1. While `convert_url` remains an alias for backward compatibility, new code should prefer `convert_uri`.","severity":"deprecated","affected_versions":">=0.1.1"},{"fix":"If `onnxruntime` issues arise, try upgrading or downgrading it, or install `markitdown` in a dedicated virtual environment. Check the `onnxruntime` GitHub repository for compatibility notes if problems persist.","message":"The `onnxruntime` dependency constraints have changed several times (pinned on Windows in v0.1.3, upper bound removed in v0.1.5). Users might encounter `onnxruntime` version conflicts, especially in complex environments.","severity":"gotcha","affected_versions":">=0.1.3"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}