Missingno
Missingno (version 0.5.2) is a Python library for visualizing missing data in datasets. It provides a small set of flexible, easy-to-use plots (matrix, bar, heatmap, and dendrogram) that give a quick visual summary of data completeness. Recent releases have addressed compatibility issues and added features.
Common errors
- AttributeError: module 'missingno' has no attribute 'geoplot'
cause: The `geoplot` method was removed in `missingno` version 0.5.0.
fix: Remove the call to `msno.geoplot()`. For geospatial data, use a dedicated library such as `geopandas`.
- TypeError: matrix() got an unexpected keyword argument 'inline'
cause: The `inline` parameter was removed from the `missingno` visualization functions in version 0.5.0.
fix: Remove the `inline=True` or `inline=False` argument from your `missingno` function calls.
- TypeError: bar() got an unexpected keyword argument 'sort'
cause: Support for the `sort` parameter changed in `missingno` version 0.4.2, when it was removed from `dendrogram` (and from `geoplot`). Whether a given plot function accepts `sort` therefore depends on the installed version.
fix: Check the documentation of the specific plot function for the version you have installed. For `dendrogram`, remove the `sort` parameter entirely. For `bar`, make sure the value is one of the currently valid options (e.g., 'ascending', 'descending').
- ValueError: Invalid 'kind' argument for plot_nullity. Must be one of ['matrix', 'bar', 'heatmap', 'dendrogram'].
cause: The code calls a non-existent or misspelled plot kind, possibly taken from older documentation or misremembered functionality.
fix: Call the plot functions by name: `msno.matrix()`, `msno.bar()`, `msno.heatmap()`, or `msno.dendrogram()`. The library does not have a generic `plot_nullity` function with a `kind` argument.
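The `TypeError` entries above all come from passing a keyword argument that the installed version no longer accepts. One possible way to guard legacy call sites is to filter the arguments against the function's actual signature. The sketch below is an illustration only: `supported_kwargs` and `fake_matrix` are hypothetical names, and `fake_matrix` is a stand-in with a reduced signature so the example runs without missingno installed.

```python
import inspect


def supported_kwargs(func, kwargs):
    """Return only the keyword arguments that `func` declares by name."""
    params = inspect.signature(func).parameters
    # If the function takes **kwargs, everything is accepted as-is.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


# Hypothetical stand-in used only so this sketch is self-contained.
# It is NOT the real msno.matrix signature.
def fake_matrix(df, figsize=(25, 10), sparkline=True):
    return figsize, sparkline


# A call written for an older release that still passes `inline`
legacy_call = {"figsize": (8, 4), "inline": False}
cleaned = supported_kwargs(fake_matrix, legacy_call)
print(cleaned)  # {'figsize': (8, 4)}
```

With the real library the same helper would be used as `msno.matrix(df, **supported_kwargs(msno.matrix, legacy_call))`. Silently dropping arguments can hide mistakes, so deleting the obsolete argument from the source is usually the better long-term fix.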
Warnings
- breaking The `geoplot` method and the `inline` parameter for visualizations were removed in `missingno` version 0.5.0. Code relying on these will break.
- deprecated The behavior of the `sort` parameter changed significantly in version 0.4.2, and the parameter was removed from `dendrogram` and `geoplot`. Code that still passes `sort` to these plots will not behave as it did before and may raise errors on newer versions.
- gotcha Older versions of `missingno` (prior to 0.5.2) may experience compatibility issues with newer versions of `matplotlib`, leading to visual glitches or errors.
- gotcha When the `ax` parameter is used to draw `msno.matrix` onto an existing `matplotlib.axes.Axes` object, the `sparkline` is not supported. Pass `sparkline=False` explicitly in that case; otherwise the sparkline is ignored or may cause issues.
- gotcha Missingno is primarily designed to work with Pandas DataFrames. Attempting to use it directly with other data structures (e.g., raw NumPy arrays or lists) will require conversion to a DataFrame first.
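The last two gotchas can be sketched together: wrap non-DataFrame input in a DataFrame first, and pass `sparkline=False` when drawing onto an existing axis. The helper names `as_dataframe` and `matrix_on_axis` are hypothetical and exist only for this illustration. The missingno import is deferred into the plotting helper so the conversion part runs on its own.

```python
import numpy as np
import pandas as pd


def as_dataframe(data, columns=None):
    """Wrap array-like input in a DataFrame so missingno can consume it."""
    if isinstance(data, pd.DataFrame):
        return data
    return pd.DataFrame(data, columns=columns)


def matrix_on_axis(df, ax):
    """Draw the nullity matrix on a caller-supplied matplotlib Axes.

    sparkline=False is passed explicitly because the sparkline is not
    supported when an Axes is supplied.
    """
    import missingno as msno  # deferred so as_dataframe works without it

    return msno.matrix(df, ax=ax, sparkline=False)


# A raw NumPy array with two missing entries
raw = np.array([[1.0, np.nan], [3.0, 4.0], [np.nan, 6.0]])
df = as_dataframe(raw, columns=["x", "y"])
missing_per_column = df.isna().sum().to_dict()
print(missing_per_column)
```

A typical call would then be `fig, ax = plt.subplots(figsize=(8, 4))` followed by `matrix_on_axis(df, ax)`, assuming matplotlib and missingno are both installed.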
Install
-
pip install missingno
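Because several of the errors and warnings on this page depend on the installed version, it can help to check it after installing. The following is a minimal sketch using only the standard library; `installed_version` and `release_tuple` are hypothetical helper names, and the version parsing is deliberately simplistic.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def release_tuple(version_string):
    """Turn '0.5.2' into (0, 5, 2) for simple comparisons."""
    parts = []
    for piece in version_string.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


found = installed_version("missingno")
if found is None:
    print("missingno is not installed; run: pip install missingno")
elif release_tuple(found) >= (0, 5, 0):
    print(f"missingno {found}: geoplot and the inline parameter are not available")
else:
    print(f"missingno {found}: predates the 0.5.0 removals")
```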
Imports
- missingno
import missingno as msno
- pandas
import pandas as pd
- numpy
import numpy as np
Quickstart
import pandas as pd
import numpy as np
import missingno as msno
import matplotlib.pyplot as plt
# Create a sample DataFrame with missing values
data = {
    'A': [1, 2, np.nan, 4, 5],
    'B': [np.nan, 2, 3, 4, np.nan],
    'C': [1, 2, 3, np.nan, 5],
    'D': [1, 2, 3, 4, 5],
}
df = pd.DataFrame(data)
print("DataFrame with missing values:")
print(df)
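print("\nMissingno Matrix Visualization:")
# Generate a missingness matrix plot. matrix() returns the main Axes,
# so set the title on it rather than on whichever Axes is current
# (the sparkline is drawn on a separate Axes).
ax = msno.matrix(df, figsize=(8, 4))
ax.set_title('Missing Data Matrix')
plt.show()
print("\nMissingno Bar Chart Visualization:")
# Generate a bar chart of per-column completeness
ax = msno.bar(df, figsize=(8, 4))
ax.set_title('Missing Data Bar Chart')
plt.show()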
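The plots in the quickstart can be cross-checked numerically with plain pandas. The sketch below rebuilds the same sample data and tabulates per-column completeness, which is roughly the information the bar chart displays. The column names of `summary` are chosen for this illustration and are not part of missingno.

```python
import numpy as np
import pandas as pd

# Same sample data as the quickstart above
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4, 5],
    'B': [np.nan, 2, 3, 4, np.nan],
    'C': [1, 2, 3, np.nan, 5],
    'D': [1, 2, 3, 4, 5],
})

summary = pd.DataFrame({
    'non_missing': df.notna().sum(),         # values present per column
    'missing': df.isna().sum(),              # values absent per column
    'fraction_complete': df.notna().mean(),  # share of rows with a value
})
print(summary)
```

Here column `B` has the most gaps (two of five values missing) and column `D` is fully populated, which should match what the matrix and bar plots show.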