{"id":5896,"library":"datashader","title":"Datashader","description":"Datashader is a Python library for high-performance visualization of very large datasets. It aggregates data onto a fixed-size grid using Numba-compiled code, with optional Dask parallelism and optional GPU acceleration through cuDF, so that billions of data points can be rendered as a raster image. The current version is 0.19.0, and releases are active, often aligning with the broader HoloViz ecosystem.","status":"active","version":"0.19.0","language":"en","source_language":"en","source_url":"https://github.com/holoviz/datashader","tags":["visualization","data-science","big-data","holoviz","gpu"],"install":[{"cmd":"pip install datashader","lang":"bash","label":"Basic Installation"},{"cmd":"pip install datashader[dask,geopandas,cudf]","lang":"bash","label":"With Optional Dependencies"}],"dependencies":[{"reason":"Core dependency for high-performance data aggregation.","package":"numba","optional":false},{"reason":"Required for numerical operations and array handling.","package":"numpy","optional":false},{"reason":"Common data input format, often used with Datashader.","package":"pandas","optional":false},{"reason":"Common data input format for N-dimensional data.","package":"xarray","optional":false},{"reason":"Used for specific reductions and operations like edge bundling.","package":"scipy","optional":false},{"reason":"Required for generating image outputs from shaded aggregations.","package":"pillow","optional":true},{"reason":"Enables processing of out-of-core and distributed datasets.","package":"dask","optional":true},{"reason":"Provides GPU-accelerated dataframes for GPU-native processing.","package":"cudf","optional":true},{"reason":"Enables direct visualization of geospatial data.","package":"geopandas","optional":true}],"imports":[{"symbol":"datashader","correct":"import datashader as ds"},{"symbol":"transfer_functions","correct":"import datashader.transfer_functions as tf"}],"quickstart":{"code":"import datashader as ds\nimport 
datashader.transfer_functions as tf\nimport pandas as pd\nimport numpy as np\nfrom PIL import Image\n\n# 1. Generate some example data\nnum_points = 100_000\ndata = pd.DataFrame({\n    'x': np.random.normal(0, 1, num_points),\n    'y': np.random.normal(0, 1, num_points),\n    'value': np.random.rand(num_points) # For coloring\n})\n\n# 2. Create a Canvas to define the aggregation grid\ncanvas = ds.Canvas(plot_width=400, plot_height=400)\n\n# 3. Aggregate the data using the mean of 'value'\nagg = canvas.points(data, 'x', 'y', agg=ds.mean('value'))\n\n# 4. Shade the aggregated data into an image\nimg = tf.shade(agg, cmap=['lightblue', 'darkblue'], how='linear')\n\n# To make it runnable and confirm output, convert to PIL Image object\npil_img = img.to_pil()\nassert isinstance(pil_img, Image.Image)\n# In a real application, you would typically save it or display it \n# using a visualization library like HoloViews or Bokeh.\n# pil_img.save(\"datashader_quickstart.png\")\n","lang":"python","description":"This quickstart demonstrates the core Datashader workflow: generating data, defining a Canvas, aggregating data points onto the canvas grid, and then shading the aggregated result into a raster image. The example uses randomly generated data and shades it based on a 'value' column. It concludes by converting the output to a PIL Image object to verify successful execution, which can then be displayed or saved."},"warnings":[{"fix":"Upgrade your Python environment to 3.10 or newer before upgrading Datashader to versions 0.17.0 or later.","message":"Python 3.9 support was officially dropped in v0.18.0. Furthermore, v0.17.0 increased the minimum supported Python version to 3.10.","severity":"breaking","affected_versions":">=0.17.0"},{"fix":"Migrate any CLI-based workflows to use the Python API. Consult the documentation for equivalent programmatic methods.","message":"The Datashader command-line interface (CLI) was removed in v0.19.0. 
Functionality previously available via the CLI must now be accessed programmatically.","severity":"breaking","affected_versions":">=0.19.0"},{"fix":"If you rely on image output or Dask for large data, install these explicitly: `pip install datashader[dask,pillow]`.","message":"Since v0.17.0, `Pillow` (for image output) and `Dask` (for large datasets) are optional dependencies. They are no longer automatically installed with `pip install datashader`.","severity":"gotcha","affected_versions":">=0.17.0"},{"fix":"If using GeoPandas, upgrade to Datashader v0.16.0 or newer to leverage direct GeoDataFrame support and remove SpatialPandas conversion steps.","message":"Prior to v0.16.0, GeoPandas GeoDataFrames often required conversion to SpatialPandas before use with Datashader. Version 0.16.0 introduced direct support for many GeoPandas geometry types (e.g., LineString, Polygon) in `Canvas` functions, simplifying geospatial workflows.","severity":"gotcha","affected_versions":"<0.16.0"},{"fix":"Upgrade to Datashader v0.18.2 or newer to resolve the quadmesh segmentation fault issue.","message":"A bug causing a segmentation fault during `quadmesh` reduction, specifically when array sizes were exceeded, was present in versions up to 0.18.1 and fixed in v0.18.2.","severity":"gotcha","affected_versions":">=0.18.0, <0.18.2"}],"env_vars":null,"last_verified":"2026-04-14T00:00:00.000Z","next_check":"2026-07-13T00:00:00.000Z"}