Datashader

0.19.0 · active · verified Tue Apr 14

Datashader is a Python library for high-performance visualization of very large datasets. It aggregates data onto a fixed-size grid using Numba-compiled code, with optional Dask and GPU (CUDA) support, so that datasets of up to billions of points can be rendered as meaningful images. The current version is 0.19.0, and the project is actively released, often in step with the broader HoloViz ecosystem.

Warnings

Install
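
Datashader is published on PyPI and conda-forge. A typical installation, assuming a standard pip or conda environment (adjust for your own setup), might look like this:

```shell
# Install from PyPI
pip install datashader

# Or, in a conda environment, from the conda-forge channel
conda install -c conda-forge datashader
```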

Imports

Quickstart

This quickstart demonstrates the core Datashader workflow: generating data, defining a Canvas, aggregating the points onto the canvas grid, and shading the aggregate into a raster image. The example uses randomly generated data, computes the mean of a 'value' column for each pixel, and colors each pixel by that mean. It ends by converting the output to a PIL Image object, which confirms that the pipeline ran and gives you something to display or save.

import datashader as ds
import datashader.transfer_functions as tf
import pandas as pd
import numpy as np
from PIL import Image

# 1. Generate some example data
num_points = 100_000
data = pd.DataFrame({
    'x': np.random.normal(0, 1, num_points),
    'y': np.random.normal(0, 1, num_points),
    'value': np.random.rand(num_points) # For coloring
})

# 2. Create a Canvas to define the aggregation grid
canvas = ds.Canvas(plot_width=400, plot_height=400)

# 3. Aggregate the data using the mean of 'value'
agg = canvas.points(data, 'x', 'y', agg=ds.mean('value'))

# 4. Shade the aggregated data into an image
img = tf.shade(agg, cmap=['lightblue', 'darkblue'], how='linear')

# 5. Convert the result to a PIL Image to confirm the pipeline produced output
pil_img = img.to_pil()
assert isinstance(pil_img, Image.Image)
# In a real application, you would typically save the image or display it
# using a visualization library like HoloViews or Bokeh.
# pil_img.save("datashader_quickstart.png")
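
To make the aggregation step in the quickstart more concrete, the following NumPy-only sketch bins points into a 400x400 grid and computes a per-cell mean, which is conceptually what a mean reduction over a canvas produces. It is an illustration under stated assumptions, not Datashader's implementation: the variable names, the fixed seed, and the use of `np.histogram2d` are all choices made for this example.

```python
import numpy as np

# Reproducible random data shaped like the quickstart's (seed is an arbitrary choice)
rng = np.random.default_rng(0)
num_points = 100_000
x = rng.normal(0, 1, num_points)
y = rng.normal(0, 1, num_points)
value = rng.random(num_points)

# Grid size matching the quickstart canvas; bin edges span the full data range
width, height = 400, 400
x_edges = np.linspace(x.min(), x.max(), width + 1)
y_edges = np.linspace(y.min(), y.max(), height + 1)

# histogram2d indexes its result as [first_arg_bin, second_arg_bin],
# so passing y first yields an array laid out as (rows=y, columns=x)
counts, _, _ = np.histogram2d(y, x, bins=[y_edges, x_edges])
sums, _, _ = np.histogram2d(y, x, bins=[y_edges, x_edges], weights=value)

# Per-cell mean of 'value'; cells that received no points stay NaN
mean_grid = np.full(counts.shape, np.nan)
np.divide(sums, counts, out=mean_grid, where=counts > 0)

print(mean_grid.shape)                    # grid dimensions (height, width)
print(int(counts.sum()))                  # number of points that landed in the grid
print(int(np.isnan(mean_grid).sum()))     # number of empty cells
```

Empty cells are left as NaN here so that "no data" stays distinguishable from a mean of zero. How missing cells are ultimately rendered is a separate decision made at the shading stage.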
