Missingno

0.5.2 · active · verified Thu Apr 16

Missingno is a Python library for visualizing missing data in datasets. It offers a small, easy-to-use set of visualizations (matrix, bar, heatmap, and dendrogram plots) that give a quick visual summary of data completeness. The current version is 0.5.2, and the project is actively maintained; recent releases address compatibility and add features.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart creates a pandas DataFrame with simulated missing values and visualizes them with `missingno.matrix` and `missingno.bar`. The matrix plot shows where values are missing, row by row and column by column, while the bar chart shows the count of non-null values per column.

import pandas as pd
import numpy as np
import missingno as msno
import matplotlib.pyplot as plt

# Create a sample DataFrame with missing values
data = {
    'A': [1, 2, np.nan, 4, 5],
    'B': [np.nan, 2, 3, 4, np.nan],
    'C': [1, 2, 3, np.nan, 5],
    'D': [1, 2, 3, 4, 5]
}
df = pd.DataFrame(data)

print("DataFrame with missing values:")
print(df)
print("\nMissingno Matrix Visualization:")

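# Generate a missingness matrix plot. msno.matrix returns the main Axes,
# so the title is set on it rather than on pyplot's implicit current axes.
ax = msno.matrix(df, figsize=(8, 4))
ax.set_title('Missing Data Matrix')
plt.show()

print("\nMissingno Bar Chart Visualization:")
# Generate a bar chart of non-null counts per column
ax = msno.bar(df, figsize=(8, 4))
ax.set_title('Missing Data Bar Chart')
plt.show()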
