Gower
0.1.2 verified Fri May 01 auth: no python
Python implementation of Gower's distance for mixed numerical and categorical data, computing pairwise distances between records in two datasets. Version 0.1.2 is the current and only release, with no active development since 2020. Requires Python >=2.7.
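As a rough illustration of what a single entry of the distance matrix represents, the sketch below computes the textbook Gower distance between two records with equal feature weights. The helper `gower_distance` and its `is_categorical` and `ranges` arguments are hypothetical names used only for this example. They are not part of the library's API, and the library's own handling of weights, ranges, and missing values may differ.

```python
def gower_distance(a, b, is_categorical, ranges):
    """Equal-weight Gower distance between two records (illustrative sketch)."""
    total = 0.0
    for x, y, cat, rng in zip(a, b, is_categorical, ranges):
        if cat:
            # Categorical feature: 0 if the values match, 1 otherwise.
            total += 0.0 if x == y else 1.0
        else:
            # Numeric feature: absolute difference scaled by the feature's range.
            # A zero or missing range contributes nothing (an assumption of this sketch).
            total += abs(x - y) / rng if rng else 0.0
    # Average the per-feature contributions.
    return total / len(a)


# One numeric feature (range 2.0 across the data) and one categorical feature.
d = gower_distance([1.0, "a"], [2.0, "b"],
                   is_categorical=[False, True],
                   ranges=[2.0, None])
print(d)  # 0.75
```

Here the numeric feature contributes 0.5 and the mismatched category contributes 1, so the average is 0.75.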
pip install gower
Common errors
error AttributeError: module 'gower' has no attribute 'gower'
cause Incorrect import: attempting to use `import gower; gower.gower(...)` instead of the correct `from gower import gower_matrix`.
fix Use `from gower import gower_matrix` and call `gower_matrix(X, Y)`.
error ValueError: Data must be 2-dimensional
cause Passing a 1D array or Series as input. The function expects a 2D array-like input (a DataFrame or a 2D NumPy array).
fix If you have a single feature, reshape it to 2D: `X[['col']]` or `X.values.reshape(-1, 1)`.
Warnings
gotcha The library handles missing values (NaN) by ignoring the feature for that pair (its distance contribution is set to 0). This may be unexpected, and there is no parameter to change the behaviour.
fix Impute missing values before calling `gower_matrix` if different handling is required.
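One possible way to impute before computing distances is sketched below using plain pandas. The choice of median for numeric columns and most frequent value for categorical columns is an assumption made for illustration, not a recommendation from the library. The column names are made up.

```python
import numpy as np
import pandas as pd

X = pd.DataFrame({
    "num": [1.0, np.nan, 3.0],
    "cat": ["a", None, "a"],
})

X_imputed = X.copy()
# Numeric column: fill gaps with the column median (NaN is skipped by default).
X_imputed["num"] = X_imputed["num"].fillna(X_imputed["num"].median())
# Categorical column: fill gaps with the most frequent value.
X_imputed["cat"] = X_imputed["cat"].fillna(X_imputed["cat"].mode().iloc[0])

print(X_imputed)
```

The imputed frame can then be passed to `gower_matrix` in place of the original.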
gotcha Categorical columns are detected by pandas object dtype. If your categorical columns are encoded as numbers (e.g., int64), they are treated as numeric, which may produce incorrect distances.
fix Make sure categorical columns have dtype 'object'. The 'category' dtype may not be recognized, so prefer 'object'.
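A minimal sketch of the dtype fix, assuming a hypothetical integer-coded column named `region_code` that should be treated as categorical:

```python
import pandas as pd

X = pd.DataFrame({
    "age": [25, 40, 31],          # genuinely numeric
    "region_code": [1, 2, 1],     # integer codes that stand for categories
})

# Cast the coded column to object so it no longer has a numeric dtype.
X["region_code"] = X["region_code"].astype(object)

print(X.dtypes)
```

After the cast, `age` keeps its integer dtype while `region_code` is reported as object.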
Imports
- gower_matrix
wrong `import gower_matrix`
correct `from gower import gower_matrix`
Quickstart
import numpy as np
import pandas as pd
from gower import gower_matrix
# Example data with mixed types
X = pd.DataFrame({
    'num': [1.0, 2.0, 3.0],
    'cat': ['a', 'b', 'a']
})
Y = pd.DataFrame({
    'num': [2.0, 3.0],
    'cat': ['b', 'c']
})
# Compute Gower's distance matrix
D = gower_matrix(X, Y)
print(D)
# Expected: array of shape (len(X), len(Y))