Prince - Factor Analysis in Python
Prince is a Python library for factor analysis methods, including Principal Component Analysis (PCA), Correspondence Analysis (CA), Multiple Correspondence Analysis (MCA), Multiple Factor Analysis (MFA), Factor Analysis of Mixed Data (FAMD), Generalized Procrustes Analysis (GPA), and Procrustes Global Analysis (PGA). As of version 0.17.0, it offers a scikit-learn compatible API, so its estimators follow the familiar `fit` / `transform` pattern and fit into existing data science workflows. The project is actively maintained and releases regularly.
Common errors
-
AttributeError: 'PCA' object has no attribute 'explained_variance_ratio_'
cause The attribute name for explained variance ratio changed from `explained_variance_ratio_` to `explained_inertia_` in version 0.10.0.
fix Update your code to use `pca.explained_inertia_` instead.
-
ValueError: Input data contains non-numeric values.
cause Methods like PCA, CA, and MFA expect purely numerical input data. You might be passing a DataFrame with categorical columns without prior encoding.
fix Ensure all input features are numerical. For categorical data, use one-hot encoding or label encoding, or consider using `prince.FAMD`, which is designed for mixed data types.
-
TypeError: fit() missing 1 required positional argument: 'X'
cause You called the `fit()` method without providing the input data (feature matrix).
fix Always pass your input data (e.g., a pandas DataFrame or NumPy array) as the `X` argument: `model.fit(X)`.
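The one-hot encoding fix for non-numeric input can be sketched with pandas alone. This is a minimal illustration, not part of Prince: the helper name `to_numeric_frame` and the sample columns are hypothetical, and the resulting frame is what you would then pass to a numeric-only model.

```python
import pandas as pd


def to_numeric_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df in which every non-numeric column is one-hot encoded.

    Hypothetical helper for illustration; not part of Prince.
    """
    # Columns whose dtype is not numeric (strings, categories, booleans, ...)
    categorical = df.select_dtypes(exclude="number").columns.tolist()
    if not categorical:
        return df.copy()
    # Numeric columns are kept as-is; each category becomes a 0.0/1.0 column
    return pd.get_dummies(df, columns=categorical, dtype=float)


# Made-up sample data with one numeric and one categorical column
df = pd.DataFrame({
    "height": [1.70, 1.82, 1.65, 1.90],
    "color": ["red", "blue", "red", "green"],
})

X = to_numeric_frame(df)
print(X.dtypes)
```

After this step every column of `X` is numeric, which avoids the error above. One-hot columns change the geometry of the data, so for genuinely mixed data the `prince.FAMD` route mentioned in the fix may be the better fit.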
Warnings
- breaking The attribute `explained_variance_ratio_` was renamed to `explained_inertia_` in version 0.10.0 to better align with factor analysis terminology.
- breaking The `weight_col` parameter in `CA`, `MCA`, and `MFA` models was removed.
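- gotcha Scaling behaviour differs between PCA implementations. `sklearn.decomposition.PCA` centers the data but does not scale it to unit variance, while `prince.PCA` controls this through its `rescale_with_mean` and `rescale_with_std` parameters. Set these explicitly, or standardize the data yourself, before comparing results across libraries.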
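Whether the data is only centered or also standardized can change the explained inertia substantially. The sketch below uses plain NumPy and an SVD to show the effect. It is not Prince's implementation, and the function name `explained_inertia` and the sample values are made up for illustration.

```python
import numpy as np


def explained_inertia(X: np.ndarray, standardize: bool) -> np.ndarray:
    """Share of total inertia carried by each principal component.

    Hypothetical helper for illustration; not Prince's implementation.
    """
    Z = X - X.mean(axis=0)             # center each column
    if standardize:
        Z = Z / Z.std(axis=0)          # scale each column to unit variance
    # Squared singular values are proportional to the eigenvalues
    # of the covariance (or correlation) matrix
    singular_values = np.linalg.svd(Z, compute_uv=False)
    eigenvalues = singular_values ** 2
    return eigenvalues / eigenvalues.sum()


# Two features on very different scales
X = np.array([
    [1.0, 100.0],
    [2.0, 300.0],
    [3.0, 200.0],
    [4.0, 500.0],
    [5.0, 400.0],
])

centered_only = explained_inertia(X, standardize=False)
standardized = explained_inertia(X, standardize=True)

print("centered only:", centered_only)
print("standardized: ", standardized)
```

With centering only, the large-scale second feature dominates and the first component carries more than 99.9% of the inertia. After standardizing, the split is 90% / 10%, which reflects the correlation of 0.8 between the two features rather than their units.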
Install
-
pip install prince
Imports
- PCA
from prince import PCA
- MCA
from prince import MCA
- CA
from prince import CA
- MFA
from prince import MFA
- FAMD
from prince import FAMD
Quickstart
import pandas as pd
from prince import PCA
# Sample data for PCA
X = pd.DataFrame({
'feature_a': [1, 2, 3, 4, 5],
'feature_b': [2, 3, 4, 5, 6],
'feature_c': [3, 4, 5, 6, 7]
})
# Initialize and fit PCA model
pca = PCA(n_components=2)
pca.fit(X)
# Transform the data
X_transformed = pca.transform(X)
print("Original data head:\n", X.head())
print("\nTransformed data head (2 components):\n", X_transformed.head())
print("\nExplained inertia per component:", pca.explained_inertia_)