Dask-GeoPandas

0.5.0 verified Fri May 01 auth: no python

Parallel GeoPandas with Dask. Extends GeoPandas to work with Dask for parallel and distributed execution of geospatial operations on large datasets. Current version: 0.5.0. Release cadence: irregular, roughly 1-2 releases per year.

pip install dask-geopandas
error ImportError: cannot import name 'read_file' from 'dask_geopandas'
cause The installed dask-geopandas version may predate read_file, or the import uses the wrong module path.
fix
Upgrade to the latest version: pip install --upgrade dask-geopandas. If the top-level import still fails, try importing from the module that defines the function: from dask_geopandas.io.file import read_file.
error TypeError: 'GeoDataFrame' object does not support item assignment
cause A Dask GeoDataFrame is lazy and partitioned, so in-place, pandas-style value assignment is not supported the way it is on a GeoPandas GeoDataFrame.
fix
Use .assign() or map_partitions() to modify columns lazily.
error distributed.utils_test - ValueError: Input data has no spatial partitions set. Call `ddf = ddf.spatial_shuffle()` first.
cause Spatial join or other spatial operation requires spatial partitions to be set.
fix
Call ddf = ddf.spatial_shuffle() before performing spatial operations like sjoin().
error ModuleNotFoundError: No module named 'dask_expr'
cause Dask's new query planning (>=2024.3.0) requires dask-expr, which may not be installed.
fix
Install dask[dataframe], which includes dask-expr: pip install "dask[dataframe]" (the quotes keep shells such as zsh from interpreting the brackets).
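deprecated The `geom_almost_equals` method was removed in v0.5.0. Use `geom_equals_exact` instead, which takes an explicit tolerance.
fix Replace calls to `ddf.geom_almost_equals(other)` with `ddf.geom_equals_exact(other, tolerance)`, choosing a tolerance that suits the coordinate units of your data.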
breaking Since v0.4.0, Shapely >=2 is required and PyGEOS is no longer supported.
fix Ensure you have Shapely >=2 installed (`pip install "shapely>=2"`; the quotes stop the shell from treating `>` as a redirect). Uninstall PyGEOS if present.
gotcha `spatial_shuffle` may produce incorrect results if the meta object from `read_file` is not set correctly. Versions 0.4.2 and later avoid this bug.
fix Upgrade to dask-geopandas >=0.4.2.
gotcha When using Dask's new query planning (dask >=2024.3.0), you must have dask-expr installed to avoid errors. It is installed automatically with `dask[dataframe]`.
fix Install dask[dataframe] or dask-expr: `pip install "dask[dataframe]"`.
breaking Dask-GeoPandas now requires Python >=3.10 as of v0.5.0.
fix Upgrade Python to 3.10 or later.
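The version requirements listed above can be checked before importing dask-geopandas. The following is a minimal sketch using only the standard library; the helper names (`installed_version`, `major`, `check_environment`) are hypothetical, and the dask-expr line is reported as a note rather than an error because whether it is needed depends on the installed Dask version.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def major(version_string):
    """Return the leading integer of a version string such as '2.0.6'."""
    return int(version_string.split(".")[0])


def check_environment():
    """Collect human-readable findings about the requirements listed above."""
    findings = []
    if sys.version_info < (3, 10):
        findings.append("Python >=3.10 is required as of dask-geopandas 0.5.0")

    shapely_version = installed_version("shapely")
    if shapely_version is None:
        findings.append("shapely is not installed (Shapely >=2 is required)")
    elif major(shapely_version) < 2:
        findings.append(f"shapely {shapely_version} found, Shapely >=2 is required")

    if installed_version("pygeos") is not None:
        findings.append("pygeos is installed but no longer supported")

    if installed_version("dask-expr") is None:
        findings.append("note: dask-expr not found; query planning may need it")
    return findings


for finding in check_environment():
    print(finding)
```

An empty result means none of the listed checks flagged anything; it does not guarantee that every dependency combination works.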

Load a GeoJSON file into a Dask GeoDataFrame lazily.

from dask_geopandas import read_file

# Read a GeoJSON file lazily; exactly one of npartitions or chunksize must be given
ddf = read_file('path/to/file.geojson', npartitions=4)

# head() computes only the first partition
print(ddf.head())