cssselect: CSS Selectors for Python

1.4.0 · active · verified Sun Mar 29

cssselect is a BSD-licensed Python library that parses CSS3 selectors and translates them into XPath 1.0 expressions. An XPath engine such as lxml can then evaluate these expressions to find matching elements in XML or HTML documents. The project is actively maintained; the current version is 1.4.0, and releases are published on PyPI.

Warnings

Install

Imports

Quickstart

This quickstart translates a CSS selector into an XPath 1.0 expression with `cssselect`, then evaluates that expression against an HTML document with `lxml` to find matching elements. It uses `HTMLTranslator`, which applies HTML-specific translation rules.

```python
from lxml.etree import fromstring
from cssselect import HTMLTranslator, SelectorError

html_doc = '''
<div id="outer">
  <p class="content">
    <span>Text 1</span>
  </p>
  <div id="inner" class="content body">
    Text 2
    <span>Text 3</span>
  </div>
</div>
'''

try:
    # Use HTMLTranslator for HTML documents for better pseudo-class handling
    translator = HTMLTranslator()
    xpath_expression = translator.css_to_xpath('div.content > span')
    print(f"Generated XPath: {xpath_expression}")

    # This snippet is well-formed XML, so lxml.etree's XML parser accepts it
    document = fromstring(html_doc)
    # Find all elements matching the XPath expression
    matches = document.xpath(xpath_expression)

    for element in matches:
        text = element.text.strip() if element.text else ''
        print(f"Matched element tag: {element.tag}, text: {text}")

except SelectorError as e:
    print(f"Invalid CSS selector: {e}")
```
