Xarray
Xarray (pronounced 'ex-array') is an open-source Python package for working with labelled multi-dimensional arrays and datasets. It adds labels in the form of dimensions, coordinates, and attributes on top of raw NumPy-like arrays, which makes scientific computing and data analysis more intuitive and less error-prone, particularly in the earth sciences. As of February 2026, the current version is 2026.2.0. Xarray uses calendar-based version numbers and releases regularly, typically every one to two months.
Warnings
- breaking: The default attribute-preservation behavior changed in `xarray` v2025.11.0. All operations now preserve attributes by default; previously, attributes were dropped unless `keep_attrs=True` was set explicitly. Binary operations now combine attributes using `drop_conflicts` instead of keeping only the left operand's attributes.
- breaking: Applying certain NumPy ufuncs directly to `xarray.DataArray` objects may now raise `NotImplementedError` because of the `__array_ufunc__` protocol. This affects ufuncs that previously converted the `DataArray` to a NumPy array implicitly.
- breaking: `Dataset.identical()`, `DataArray.identical()`, and `testing.assert_identical()` now also compare indexes. Two objects with identical data but different indexes are no longer considered identical.
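The warnings above are easier to live with when code states its intent explicitly instead of relying on defaults. The following is a minimal sketch, not an official migration recipe: it uses only long-standing calls (`keep_attrs=`, `to_numpy()`, `assign_attrs()`, `equals()`, `identical()`), and the variable names are made up for illustration.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("x", "y"),
    attrs={"units": "m"},
)

# Passing keep_attrs explicitly gives the same result on either side of the
# default change: True keeps the attributes, False drops them.
kept = da.mean(dim="x", keep_attrs=True)
dropped = da.mean(dim="x", keep_attrs=False)

# Converting to a NumPy array explicitly avoids depending on implicit
# coercion when a NumPy function is applied.
raw_sqrt = np.sqrt(da.to_numpy())

# equals() compares values, dimensions and coordinates;
# identical() additionally compares names and attributes.
relabelled = da.assign_attrs(units="km")
same_values = da.equals(relabelled)
strictly_same = da.identical(relabelled)

print(kept.attrs, dropped.attrs, same_values, strictly_same)
```

Code that needs only value equality can use `equals()`; `identical()` is the stricter check and is the one affected by the index comparison described above.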
Install
- pip install xarray
- pip install "xarray[complete]"
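After installing, it can be useful to check which release is actually in the environment, since the attribute default described in the warnings depends on it. A small sketch, assuming the `packaging` library is importable (it is normally installed alongside xarray); the constant name `ATTRS_DEFAULT_CHANGE` is hypothetical and simply records the version quoted in the warnings.

```python
from packaging.version import Version

import xarray as xr

# Release named in the warnings as the one that changed the attribute default.
ATTRS_DEFAULT_CHANGE = Version("2025.11.0")

installed = Version(xr.__version__)
attrs_kept_by_default = installed >= ATTRS_DEFAULT_CHANGE

print(f"xarray {installed}: attributes kept by default = {attrs_kept_by_default}")
```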
Imports
- DataArray
import xarray as xr; da = xr.DataArray(...)
- Dataset
import xarray as xr; ds = xr.Dataset(...)
- xray (former package name; the package is now imported as xarray)
import xarray as xr
Quickstart
import xarray as xr
import numpy as np
import pandas as pd
# Create a DataArray
data_array = xr.DataArray(
    np.random.rand(2, 3),
    coords={"x": [10, 20], "y": ["a", "b", "c"]},
    dims=("x", "y"),
    name="random_data",
)
# Create a Dataset with two DataArrays sharing coordinates
temp = xr.DataArray(
    25 + 10 * np.random.randn(2, 3, 4),
    coords={
        "time": pd.to_datetime(["2026-01-01", "2026-01-02"]),
        "lat": [40, 50, 60],  # three values to match the size-3 lat axis
        "lon": [100, 110, 120, 130],
    },
    dims=("time", "lat", "lon"),
    name="temperature",
    attrs={"units": "Celsius", "long_name": "Air Temperature"},
)
precip = xr.DataArray(
    5 * np.random.rand(2, 3, 4),
    coords=temp.coords,  # Share coordinates from temp
    dims=temp.dims,
    name="precipitation",
    attrs={"units": "mm", "long_name": "Precipitation Rate"},
)
dataset = xr.Dataset({"temp": temp, "precip": precip})
# Perform a simple operation (e.g., mean over 'time' dimension)
mean_temp = dataset["temp"].mean(dim="time")
print("DataArray:\n", data_array)
print("\nDataset:\n", dataset)
print("\nMean temperature over time:\n", mean_temp)
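The practical payoff of dimensions and coordinates is that data can be addressed by label instead of by axis position. The sketch below rebuilds a small temperature array with deterministic values (an assumption made here so the results are reproducible, unlike the random data above) and shows label-based selection next to its positional equivalent.

```python
import numpy as np
import pandas as pd
import xarray as xr

temp = xr.DataArray(
    np.arange(24.0).reshape(2, 3, 4),
    coords={
        "time": pd.to_datetime(["2026-01-01", "2026-01-02"]),
        "lat": [40, 50, 60],
        "lon": [100, 110, 120, 130],
    },
    dims=("time", "lat", "lon"),
    name="temperature",
)

# Select by coordinate label rather than by integer position.
point = temp.sel(lat=50, lon=120)

# The same element addressed positionally with isel().
same_point = temp.isel(lat=1, lon=2)

# Inexact labels can be matched to the closest coordinate value.
nearest_row = temp.sel(lat=48, method="nearest")

# Reductions take dimension names instead of axis numbers.
zonal_mean = temp.mean(dim="lon")

print(point.values, nearest_row.lat.item(), zonal_mean.dims)
```

Because `sel()` works on coordinate values, the code keeps working if the array is later transposed or subset, which is where positional indexing tends to break.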