lxml
6.0.2 | verified Tue May 12 | auth: no | python | install: verified | quickstart: stale
lxml is a Pythonic XML and HTML processing library that wraps the C libraries libxml2 and libxslt behind an ElementTree-compatible API. The current version is 6.0.2, released on March 28, 2026. Releases are frequent: versions 6.0.1 and 6.0.0 were released on March 15, 2026, and March 1, 2026, respectively.
pip install lxml
Common errors
error ModuleNotFoundError: No module named 'lxml'
cause lxml is not installed in the active Python environment, or the interpreter running the code is not the one lxml was installed into.
fix Install lxml with pip: pip install lxml (or pip3 install lxml). Running python -m pip install lxml ties the install to a specific interpreter. If you use a virtual environment, activate it before installing.
error lxml.etree.XMLSyntaxError: Start tag expected, '<' not found
cause The parser found no '<' at the start of the input. This typically happens when the input is empty or is not XML at all (for example JSON, plain text, or a file path passed to `lxml.etree.fromstring()`, which expects the document content). A related mistake is parsing HTML with `lxml.etree.fromstring()` instead of the dedicated HTML parser, which often fails with a different XMLSyntaxError message.
fix If the content is HTML, use lxml.html.fromstring() instead of lxml.etree.fromstring(). For XML, confirm that the input is non-empty and well-formed. For damaged documents, use the HTML parser or an XML parser created with etree.XMLParser(recover=True), which attempts to recover from errors.
error ImportError: DLL load failed: The specified module could not be found.
cause This error usually occurs on Windows when lxml's compiled extension modules, or the C libraries they depend on (libxml2 and libxslt), cannot be loaded. Common triggers are a missing Microsoft Visual C++ Redistributable or an lxml build that does not match the Python installation.
fix Reinstall lxml (pip uninstall lxml, then pip install lxml). If the issue persists, install the Microsoft Visual C++ Redistributable that matches your Python version. If PyPI offers no binary wheel for your Python/OS combination, a pre-compiled wheel from an unofficial source is a last resort.
error ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
cause You passed a Python str to `lxml.etree.fromstring()` that contains an XML declaration specifying an encoding (e.g., `<?xml version="1.0" encoding="utf-8"?>`). A str is already decoded, so lxml rejects the encoding declaration. It accepts either bytes with the declaration or a str without it.
fix Either encode the string to bytes with the declared encoding before parsing (e.g., xml_string.encode('utf-8')), or remove the encoding declaration from the string if you intend to parse it as a str.
Warnings
breaking In version 5.2.0, the lxml.html.clean module was moved to a separate project, lxml_html_clean.
fix Install lxml_html_clean separately (pip install lxml_html_clean) and import from lxml_html_clean instead of lxml.html.clean.
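For code that has to run against both older and newer lxml versions, one option is to try the new import location first and fall back to the old one. The following is only a sketch; the helper name load_cleaner is hypothetical and not part of either package.

```python
def load_cleaner():
    """Return the Cleaner class from whichever location is importable, else None."""
    try:
        # Separate project, needed with lxml 5.2.0 and later
        from lxml_html_clean import Cleaner
        return Cleaner
    except ImportError:
        pass
    try:
        # Legacy location bundled with older lxml releases
        from lxml.html.clean import Cleaner
        return Cleaner
    except ImportError:
        return None


Cleaner = load_cleaner()
if Cleaner is None:
    print("No HTML cleaner available; install lxml_html_clean")
else:
    print("Using cleaner from", Cleaner.__module__)
```

Returning None instead of raising keeps the decision about how to react to a missing cleaner with the caller.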
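gotcha XPath expressions that select nodes return a list, even when only one node matches; the list is empty when nothing matches. Expressions such as count() or string() return a number or a string instead.
fix Use result[0] to access the first match, and check that the list is non-empty first to avoid an IndexError.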
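The return types can be illustrated with a small inline document. This is a minimal sketch; the XML content and variable names are invented for illustration.

```python
from lxml import etree

root = etree.fromstring(
    b"<root><item id='a'>first</item><item id='b'>second</item></root>"
)

items = root.xpath("//item")            # list of elements
missing = root.xpath("//nope")          # empty list, not None
ids = root.xpath("//item/@id")          # list of attribute values (strings)
count = root.xpath("count(//item)")     # float
text = root.xpath("string(//item[1])")  # string

# Guard against an empty result before indexing
first = items[0] if items else None
first_missing = missing[0] if missing else None

print(first.text, first_missing, list(ids), count, text)
```

When only the first match is needed, root.find(".//item") is an alternative that returns a single element or None, although it supports only the ElementPath subset of XPath.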
breaking The quickstart script raises FileNotFoundError when its input file, 'sample.xml' (or the path set in the 'XML_FILE_PATH' environment variable), does not exist in the execution environment.
fix Place 'sample.xml' in the working directory, or set 'XML_FILE_PATH' to an existing file path before running the script.
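If the script should keep working when the file is absent, one approach is to check the path first and fall back to an inline document. The sketch below assumes that a fallback is acceptable for your use case; load_tree and FALLBACK_XML are hypothetical names, and the fallback content is invented.

```python
import os
from pathlib import Path

from lxml import etree

# Hypothetical stand-in document used only when no file is found
FALLBACK_XML = b"<root><element attribute='value'>demo</element></root>"


def load_tree(path=None):
    """Parse the file at `path` (or XML_FILE_PATH, or sample.xml) if it exists."""
    candidate = Path(path or os.environ.get("XML_FILE_PATH", "sample.xml"))
    if candidate.is_file():
        return etree.parse(str(candidate))
    # No file available: build a tree from the inline fallback instead
    return etree.ElementTree(etree.fromstring(FALLBACK_XML))


tree = load_tree("definitely_missing_file.xml")
print(tree.getroot().tag)
```

In a production script, failing loudly with a clear message may be preferable to a silent fallback.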
Install compatibility verified last tested: 2026-05-12
python os / libc status wheel install import disk
3.10 alpine (musl) - - 0.03s 29.7M
3.10 slim (glibc) - - 0.02s 30M
3.11 alpine (musl) - - 0.07s 31.5M
3.11 slim (glibc) - - 0.05s 32M
3.12 alpine (musl) - - 0.06s 23.5M
3.12 slim (glibc) - - 0.06s 24M
3.13 alpine (musl) - - 0.05s 23.2M
3.13 slim (glibc) - - 0.05s 24M
3.9 alpine (musl) - - 0.03s 29.2M
3.9 slim (glibc) - - 0.02s 30M
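To check an installation in your own environment, a short script can report which interpreter is running and which library versions lxml was built against. This is a sketch; describe_lxml_install is a hypothetical helper name, while the version attributes are module-level constants exposed by lxml.etree.

```python
import importlib.util
import sys


def describe_lxml_install():
    """Report the active interpreter and, if lxml is importable, its versions."""
    info = {"interpreter": sys.executable, "installed": False}
    if importlib.util.find_spec("lxml") is None:
        # lxml is not visible to this interpreter
        return info

    from lxml import etree

    info["installed"] = True
    info["lxml"] = ".".join(map(str, etree.LXML_VERSION))
    info["libxml2"] = ".".join(map(str, etree.LIBXML_VERSION))
    info["libxslt"] = ".".join(map(str, etree.LIBXSLT_VERSION))
    return info


print(describe_lxml_install())
```

If "installed" is False even though pip reports lxml as present, the interpreter path in the output usually shows that pip and the script are using different Python environments.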
Imports
- etree
  from lxml import etree
- html
  from lxml import html
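As a rough guide, etree is for well-formed XML and html is for HTML, which the XML parser often rejects. The sketch below uses invented sample markup to show both, including the XMLSyntaxError raised when HTML with an unclosed tag is given to the XML parser.

```python
from lxml import etree, html

# Well-formed XML goes through etree
xml_root = etree.fromstring(b"<catalog><book id='1'>Dune</book></catalog>")

# HTML with an unclosed <br> is fine for the HTML parser
snippet = "<div><p>Hello<br>world</p></div>"
html_root = html.fromstring(snippet)

# The same snippet is not well-formed XML, so etree rejects it
try:
    etree.fromstring(snippet)
    xml_parser_rejected = False
except etree.XMLSyntaxError:
    xml_parser_rejected = True

print(xml_root.tag, html_root.tag, html_root.text_content(), xml_parser_rejected)
```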
Quickstart stale last tested: 2026-04-23
import os
from lxml import etree

# Load XML from a file (path from XML_FILE_PATH, defaulting to sample.xml)
with open(os.environ.get('XML_FILE_PATH', 'sample.xml'), 'rb') as f:
    tree = etree.parse(f)

# Run an XPath query; node-selecting queries return a list of elements
result = tree.xpath('//element[@attribute="value"]')

# Print each matching element, serialized as bytes
for elem in result:
    print(etree.tostring(elem))
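A variant that does not depend on a file on disk can be useful for trying the same query. The sketch below parses an inline document, invented for illustration, and also shows the ValueError listed under Common errors: a str containing an encoding declaration is rejected, while the same content encoded to bytes parses. The helper names are hypothetical.

```python
from lxml import etree

XML_TEXT = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<root><element attribute="value">kept</element>'
    '<element attribute="other">skipped</element></root>'
)


def parse_document(text):
    """Encode the str to bytes so the encoding declaration is accepted."""
    return etree.fromstring(text.encode("utf-8"))


def str_with_declaration_fails(text):
    """Return True if parsing the raw str raises ValueError."""
    try:
        etree.fromstring(text)
    except ValueError:
        return True
    return False


root = parse_document(XML_TEXT)
matches = root.xpath('//element[@attribute="value"]')
for elem in matches:
    # encoding="unicode" makes tostring return a str instead of bytes
    print(etree.tostring(elem, encoding="unicode"))
```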