lxml
lxml is a Python library for processing XML and HTML. It wraps the C libraries libxml2 and libxslt and exposes them through an API compatible with ElementTree. The current version is 6.0.2, released on March 28, 2026. It was preceded by 6.0.1 (March 15, 2026) and 6.0.0 (March 1, 2026), so recent releases have arrived roughly two weeks apart.
Warnings
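- breaking Since version 5.2.0, the 'lxml.html.clean' module is no longer bundled with lxml. It lives in a separate project, 'lxml_html_clean', which must be installed separately.
- gotcha XPath queries that select nodes return a list, even when only one node matches. Index the result (for example result[0]) to get the first element, and check for an empty list before indexing.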
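The XPath gotcha above can be illustrated with a short sketch. The sample document and variable names are made up for illustration; only the lxml calls themselves (`etree.fromstring`, `xpath`, `find`) come from the library.

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
root = etree.fromstring(
    b"<catalog><book id='b1'><title>First</title></book>"
    b"<book id='b2'><title>Second</title></book></catalog>"
)

# A node-selecting expression returns a list, even for a single match.
titles = root.xpath("//book[@id='b1']/title")
first_title = titles[0].text if titles else None

# No match yields an empty list, so guard before indexing.
missing = root.xpath("//book[@id='nope']/title")
missing_title = missing[0].text if missing else None

# find() returns the first matching element or None.
# It supports the ElementPath subset, not full XPath.
found = root.find(".//book[@id='b2']/title")

# Expressions that do not select nodes return scalars instead of lists.
book_count = root.xpath("count(//book)")

print(first_title, missing_title, found.text, book_count)
```

When only the first match is needed and the expression is simple, `find()` may be the more convenient choice because it avoids the indexing step.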
Install
pip install lxml
Imports
- etree
from lxml import etree
- html
from lxml import html
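As a rough illustration of when each import is useful, the sketch below parses a strict XML snippet with `etree` and a deliberately sloppy HTML fragment with `html`. Both inputs are invented for this example.

```python
from lxml import etree, html

# etree: strict XML parsing into an element tree.
xml_root = etree.fromstring(b"<root><item name='a'>one</item></root>")
item = xml_root.find("item")

# html: lenient parsing that tolerates unclosed tags in real-world markup.
# The fragment below is a made-up example with two unclosed <p> tags.
doc = html.fromstring("<div><p>Hello <a href='/docs'>docs</a><p>Second</div>")
links = doc.xpath("//a/@href")
paragraphs = [p.text_content() for p in doc.xpath("//p")]

print(item.get("name"), item.text)
print(links, paragraphs)
```

Feeding the same HTML fragment to `etree.fromstring` would raise a syntax error, which is the main reason to reach for `lxml.html` when the input is not well-formed XML.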
Quickstart
import os
from lxml import etree
# Load XML from a file
with open(os.environ.get('XML_FILE_PATH', 'sample.xml'), 'rb') as f:
    tree = etree.parse(f)
# Perform XPath query
result = tree.xpath('//element[@attribute="value"]')
# Process result
for elem in result:
    print(etree.tostring(elem))