{"id":4961,"library":"htmldocx","title":"HTML to DOCX Converter (htmldocx)","description":"The `htmldocx` library converts HTML content to DOCX format, building on `python-docx` and `beautifulsoup4`. Its last release was in August 2021 and it is in a maintenance state; more actively developed forks address limitations and bugs present in this version.","status":"maintenance","version":"0.0.6","language":"en","source_language":"en","source_url":"https://github.com/pqzx/html2docx","tags":["html","docx","convert","document","word","python-docx"],"install":[{"cmd":"pip install htmldocx","lang":"bash","label":"Install `htmldocx`"}],"dependencies":[{"reason":"Required for creating and manipulating DOCX files.","package":"python-docx","optional":false},{"reason":"Required for parsing HTML content.","package":"beautifulsoup4","optional":false}],"imports":[{"symbol":"HtmlToDocx","correct":"from htmldocx import HtmlToDocx"}],"quickstart":{"code":"from docx import Document\nfrom htmldocx import HtmlToDocx\n\ndocument = Document()\nnew_parser = HtmlToDocx()\n\nhtml_content = '<h1>Hello world</h1><p>This is a paragraph.</p>'\n\n# Add HTML to an existing Document object\nnew_parser.add_html_to_document(html_content, document)\n\n# Save the document\ndocument.save('your_file_name.docx')\n\n# Or convert a file directly (the library appends '.docx' to the output name)\n# new_parser.parse_html_file('input.html', 'output')\n\n# Or convert an HTML string to a new Document object\n# docx_object = new_parser.parse_html_string('<h2>Another title</h2>')\n# docx_object.save('another_file.docx')","lang":"python","description":"Initialise `HtmlToDocx` and use `add_html_to_document` to insert HTML into a `python-docx` Document object, or use `parse_html_file` / `parse_html_string` for direct conversion."},"warnings":[{"fix":"Consider a more actively maintained fork or alternative library, such as `html-for-docx`, for robust HTML to DOCX conversion.","message":"The `htmldocx` package has not been updated since August 2021, so it may lack HTML rendering features, bug fixes, and compatibility updates found in more recent forks and alternatives.","severity":"gotcha","affected_versions":"<=0.0.6"},{"fix":"Test thoroughly with your specific HTML inputs. For advanced styling or complex layouts, expect to need workarounds or an alternative library.","message":"Maintainers of forks such as `html-for-docx` have cited \"limitations and bugs\" in the original `pqzx/html2docx` codebase (the source of `htmldocx`) that prevented them from completing tasks. Users may encounter similar rendering issues with complex HTML structures or specific CSS styles.","severity":"gotcha","affected_versions":"<=0.0.6"},{"fix":"Set `new_parser.table_style = 'Light Shading Accent 4'` or another valid style (e.g., 'TableGrid') before adding HTML content.","message":"Tables are not styled by default when converted. To apply borders or shading, set the `table_style` attribute on the `HtmlToDocx` parser instance.","severity":"gotcha","affected_versions":"<=0.0.6"},{"fix":"Set the `paragraph_style` attribute on the `HtmlToDocx` parser instance to apply a default style to all paragraphs (e.g., `new_parser.paragraph_style = 'Normal'`).","message":"No paragraph style is applied by default. Styling defined in the HTML is still applied, but no base paragraph style is set automatically.","severity":"gotcha","affected_versions":"<=0.0.6"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}