{"id":6405,"library":"pandas-read-xml","title":"pandas-read-xml (Legacy)","description":"`pandas-read-xml` reads XML files directly into pandas DataFrames. It simplifies converting hierarchical XML data into a tabular format, with options for specifying the path to the row elements and for automatic flattening. Note that `pandas.read_xml` was added to the core pandas library in version 1.3.0 and largely supersedes this standalone package on newer pandas installations. The latest version of this standalone library is 0.3.1.","status":"maintenance","version":"0.3.1","language":"en","source_language":"en","source_url":"https://github.com/minchulkim87/pandas_read_xml","tags":["xml","pandas","data ingestion","legacy"],"install":[{"cmd":"pip install pandas-read-xml","lang":"bash","label":"Install `pandas-read-xml`"}],"dependencies":[{"reason":"Core data structure for output.","package":"pandas"},{"reason":"Used internally for XML parsing.","package":"xmltodict","optional":false},{"reason":"Not a dependency of pandas-read-xml itself. Relevant when migrating to `pandas.read_xml`, whose default `lxml` parser supports full XPath 1.0 expressions and XSLT stylesheets.","package":"lxml","optional":true}],"imports":[{"note":"The core pandas library (>=1.3.0) now includes its own `pd.read_xml` function. 
This standalone package provides `pdx.read_xml`.","wrong":"import pandas as pd\ndf = pd.read_xml(...)","symbol":"read_xml","correct":"import pandas_read_xml as pdx\ndf = pdx.read_xml(...)"}],"quickstart":{"code":"import os\nimport tempfile\n\nimport pandas_read_xml as pdx\n\nxml_data = \"\"\"<?xml version='1.0' encoding='utf-8'?>\n<root>\n    <item id=\"1\">\n        <name>Apple</name>\n        <price>1.00</price>\n    </item>\n    <item id=\"2\">\n        <name>Banana</name>\n        <price>0.50</price>\n    </item>\n</root>\"\"\"\n\n# Assumption: pdx.read_xml expects a path to an XML file, so the sample is written to a temporary .xml file first.\nwith tempfile.TemporaryDirectory() as tmp_dir:\n    xml_path = os.path.join(tmp_dir, 'items.xml')\n    with open(xml_path, 'w', encoding='utf-8') as xml_file:\n        xml_file.write(xml_data)\n    # The root key list ['root', 'item'] walks down to the repeated <item> elements.\n    df = pdx.read_xml(xml_path, ['root', 'item'])\n\nprint(df)","lang":"python","description":"Reads a simple XML document from a file path into a pandas DataFrame. The second argument is the root key list: the sequence of tag names to descend through to reach the repeated row elements."},"warnings":[{"fix":"If using pandas >= 1.3.0, replace `import pandas_read_xml as pdx` and `pdx.read_xml(...)` with `import pandas as pd` and `pd.read_xml(...)`, and express the root key list as an `xpath` argument (for example, `['root', 'item']` becomes `xpath='/root/item'`).","message":"Since version 1.3.0, `pandas` ships its own `pandas.read_xml()`. It is a separate implementation with a different signature (rows are selected with an `xpath` expression rather than a root key list), so it is not a drop-in replacement. For new projects or installations with pandas >= 1.3.0, it is generally recommended to use `pd.read_xml()` directly instead of this standalone package.","severity":"breaking","affected_versions":"All versions of `pandas-read-xml`; `pandas.read_xml` requires pandas >= 1.3.0"},{"fix":"Prefer `pandas.read_xml` for robust and actively maintained solutions, especially in production environments.","message":"The `pandas-read-xml` GitHub repository explicitly states: 'Note that this isn't a mature or anything close to a complete solution. So I don't recommend using it in \"production\".' This suggests it was intended as a temporary solution before native pandas support.","severity":"deprecated","affected_versions":"All versions of `pandas-read-xml`"},{"fix":"Thoroughly understand your XML schema. 
Use the `xpath`, `namespaces`, and `stylesheet` parameters (in `pandas.read_xml`) or `root_key_list` (in `pandas-read-xml`) to target specific elements. With `pandas.read_xml`, keep the default `lxml` parser when you need full XPath 1.0 or XSLT support.","message":"Complex or deeply nested XML is hard to map onto a flat table. `pandas.read_xml` may need carefully written XPath expressions, namespace mappings, or an XSLT stylesheet to flatten the data first; `pandas-read-xml` offers none of these and relies on `root_key_list` plus its flattening helpers.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Experiment with `root_is_rows=False` and `transpose=True` if your initial DataFrame output appears malformed or inverted. Carefully inspect the resulting DataFrame and adjust parameters based on your XML's structure.","message":"The `root_is_rows` and `transpose` arguments in `pandas-read-xml` can be tricky. If the XML structure does not match the default assumptions, the resulting DataFrame may come out transposed or with rows and columns misinterpreted.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Use the `flatten()` or `auto_flatten()` helpers provided by `pandas_read_xml` if you encounter inconsistent tag structures. For `pandas.read_xml`, manual post-processing with `json_normalize` (after converting to dict) or `explode` might be necessary.","message":"XML data can have inconsistent shapes under the same tag (for example, a single element in some records and a list of elements in others), which makes flattening difficult. `pandas_read_xml` includes `flatten()` and `auto_flatten()` to address this, but it remains a complex issue.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z"}