{"id":2543,"library":"itemloaders","title":"Itemloaders","description":"itemloaders is the library behind Scrapy's ItemLoader. It provides a flexible way to populate items from extracted data: values are collected from several sources (XPath, CSS, regular expressions, JMESPath) and passed through chains of input and output processors. The current version is 1.4.0, and releases regularly adjust the range of supported Python versions.","status":"active","version":"1.4.0","language":"en","source_language":"en","source_url":"https://github.com/scrapy/itemloaders","tags":["scrapy","web scraping","data extraction","item processing","html parsing"],"install":[{"cmd":"pip install itemloaders","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Required for parsing and selector functionality (XPath, CSS, JMESPath).","package":"parsel","optional":false}],"imports":[{"symbol":"ItemLoader","correct":"from itemloaders import ItemLoader"},{"symbol":"TakeFirst","correct":"from itemloaders.processors import TakeFirst"},{"symbol":"MapCompose","correct":"from itemloaders.processors import MapCompose"}],"quickstart":{"code":"from itemloaders import ItemLoader\nfrom itemloaders.processors import MapCompose, TakeFirst\nfrom parsel import Selector\n\n# ItemLoader populates a plain dict by default (default_item_class = dict).\n# A bare custom class is not a supported item type, so none is defined here.\nclass ProductLoader(ItemLoader):\n    # Output processor: reduce each field's collected values to the first non-empty one\n    default_output_processor = TakeFirst()\n\n    # Input processors run on each extracted value as it is added\n    name_in = MapCompose(str.strip, str.title)\n    price_in = MapCompose(lambda x: x.replace('$', ''), float)\n    description_in = MapCompose(str.strip)\n\n# Example HTML fragment\nhtml_data = '''\n<div class=\"product\">\n    <h1 class=\"name\">  product a  </h1>\n    <span 
class=\"price\">$12.99</span>\n    <div class=\"description\">A really good product.</div>\n</div>\n'''\n\n# Use parsel.Selector for data extraction\nselector = Selector(text=html_data)\n\n# Instantiate the loader and collect values for each field\nloader = ProductLoader(selector=selector)\nloader.add_css('name', '.name::text')\nloader.add_xpath('price', '//span[@class=\"price\"]/text()')\nloader.add_value('description', 'Short description from custom source.')  # Add a fixed value\nloader.add_css('description', '.description::text')  # Values from several sources accumulate in order\n\n# Run the output processors and build the item\nitem = loader.load_item()\n\nprint(item)\n# Expected output: {'name': 'Product A', 'price': 12.99, 'description': 'Short description from custom source.'}\n","lang":"python","description":"This quickstart defines an `ItemLoader` subclass that populates a plain `dict` (the default item class), extracts values from an HTML string through `parsel.Selector` with CSS and XPath selectors, and cleans them with `MapCompose` input processors and a `TakeFirst` output processor. Because `TakeFirst` keeps the first non-empty value, the `description` added with `add_value` takes precedence over the one extracted later with `add_css`."},"warnings":[{"fix":"Ensure your project's Python version meets the minimum requirements for the `itemloaders` version you are using. Check the release notes for specific version requirements before upgrading.","message":"Python version compatibility has changed frequently, dropping support for older versions. For example, v1.4.0 dropped Python 3.8-3.9, v1.2.0 dropped Python 3.7, and v1.1.0 dropped Python 3.6.","severity":"breaking","affected_versions":">=1.1.0"},{"fix":"Upgrade to version 1.3.1 or newer, which includes a fix for this issue.","message":"Version 1.3.0 introduced a regression where nested loaders would raise an error when encountering empty matches.","severity":"gotcha","affected_versions":"1.3.0"},{"fix":"Upgrade to version 1.0.6 or newer, which fixed this regression. 
If constrained to 1.0.5, pass the `re` argument as a string pattern rather than a compiled one.","message":"In version 1.0.5, passing a compiled regular expression (e.g., `re.compile('...')`) as the `re` parameter of `ItemLoader.add_xpath`, `add_css`, and similar methods could raise an exception because the value was forwarded directly to `lxml`.","severity":"gotcha","affected_versions":"1.0.5"},{"fix":"If using JMESPath features, ensure your `parsel` dependency is explicitly set to `parsel>=1.8.1`.","message":"JMESPath support, introduced in v1.1.0 with methods like `ItemLoader.add_jmes`, requires `parsel` version 1.8.1 or newer. `itemloaders` itself might declare a lower minimum `parsel` version, so installing it alone does not guarantee that JMESPath features work.","severity":"gotcha","affected_versions":">=1.1.0"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}