scatterd
scatterd is a Python package for creating scatter plots quickly and with little code. The current version is 1.4.2, and the package has an active release cadence.
Common errors
-
ModuleNotFoundError: No module named 'scatterd'
cause The 'scatterd' package is not installed in the current Python environment.
fix Run `pip install scatterd` to install the package.
-
TypeError: 'numpy.ndarray' object is not callable (or a similar TypeError for non-numeric data)
cause Plotting functions expect numeric data for the X and Y axes. Passing non-numeric data (e.g., strings) can raise a TypeError. The 'not callable' variant specifically means an array was called with parentheses, e.g. `x(0)` instead of `x[0]`.
fix Ensure that the arrays passed to the `x` and `y` parameters are numeric (integers or floats). Convert data types with `df['column'].astype(float)` or a similar method if necessary, and index arrays with square brackets.
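The numeric-data fix above can be sketched without any plotting dependency. The helper name `to_float_list` is hypothetical and not part of scatterd; it coerces values to floats and reports the first entry that cannot be converted, so the problem surfaces before the plotting call rather than inside it.

```python
def to_float_list(values):
    """Coerce a sequence to floats, naming the first value that cannot be converted.

    Hypothetical helper for illustration; not part of scatterd.
    """
    result = []
    for index, value in enumerate(values):
        try:
            result.append(float(value))
        except (TypeError, ValueError) as exc:
            raise ValueError(
                f"non-numeric value {value!r} at position {index}"
            ) from exc
    return result


# Numeric strings and integers are both converted to floats.
x = to_float_list(["1.5", 2, 3.25])
```

For data already held in a pandas DataFrame, `df['column'].astype(float)` serves the same purpose in a single call.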
Warnings
- gotcha Be mindful of overplotting when dealing with large datasets or discrete variables. Multiple observations at the same coordinates can appear as a single point and mislead visual interpretation. Consider density plots or alpha blending in such cases.
- gotcha Correlation shown in a scatter plot does not imply causation. A strong visual correlation between two variables does not mean one causes the other.
- gotcha Ignoring outliers or misinterpreting the scale of axes can lead to incorrect conclusions about data trends and patterns. Discrepancies in scale or non-uniform scaling can distort the perceived relationships.
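One way to check for the overplotting described in the first warning is to count how many observations share exact coordinates before plotting. This is a minimal, dependency-free sketch: `overplot_report` is a hypothetical helper, and the suggested alpha of 1 / (largest stack) is only a rough heuristic, not a scatterd feature.

```python
from collections import Counter


def overplot_report(xs, ys):
    """Summarise how many points share identical coordinates.

    Hypothetical helper for illustration; not part of scatterd.
    """
    counts = Counter(zip(xs, ys))
    total = sum(counts.values())
    hidden = total - len(counts)  # points drawn on top of another point
    largest_stack = max(counts.values()) if counts else 0
    # Rough heuristic: lower alpha as stacks grow, so dense spots
    # render darker than isolated points.
    suggested_alpha = 1.0 / largest_stack if largest_stack else 1.0
    return {
        "total": total,
        "distinct": len(counts),
        "hidden": hidden,
        "largest_stack": largest_stack,
        "suggested_alpha": suggested_alpha,
    }


# Discrete values: several observations fall on the same positions.
xs = [1, 1, 1, 2, 2, 3]
ys = [5, 5, 5, 4, 4, 1]
report = overplot_report(xs, ys)
```

If `hidden` is a large share of `total`, the resulting alpha can be passed to any plotting call that supports transparency, such as Matplotlib's `scatter(..., alpha=...)`.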
Install
-
pip install scatterd
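After installing, it can help to confirm that the package is visible to the same interpreter that will run the plotting code, since `pip` and `python` pointing at different environments can still produce a ModuleNotFoundError. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util
import sys


def is_installed(package_name):
    """Return True if the current interpreter can locate the package."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("scatterd"):
    print("scatterd is available to", sys.executable)
else:
    # Installing through the running interpreter avoids pip/python mismatches.
    print(f"Not found. Try: {sys.executable} -m pip install scatterd")
```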
Imports
- scatterd
import scatterd
from scatterd import scatterd
Quickstart
import pandas as pd
import numpy as np
from scatterd import scatterd
import matplotlib.pyplot as plt
# Sample data
np.random.seed(42)
data = {
    'X': np.random.rand(100) * 10,
    'Y': np.random.rand(100) * 10 + np.random.rand(100) * 2,
    'Size': np.random.rand(100) * 50 + 10,  # Example for variable point sizes
    'Category': np.random.choice(['A', 'B', 'C'], 100),
}
df = pd.DataFrame(data)
# Create a basic scatter plot using the scatterd function.
# Assumption: scatterd accepts common plotting arguments similar to
# Matplotlib's scatter. Verify the keyword names below against the
# installed version before relying on them.
scatterd(
    x=df['X'],
    y=df['Y'],
    title='My First Scatter Plot',
    xlabel='Feature X',
    ylabel='Feature Y',
    s=df['Size'],
    c=df['Category'],
    colorbar_title='Category',  # Example for categorical coloring
)
# Display the plot. scatterd often leverages Matplotlib internally.
plt.show()