pandas-flavor
pandas-flavor is a Python library that extends Pandas' extension API by simplifying the registration of custom methods and accessors on Pandas DataFrames, Series, and GroupBy objects. It also makes these registration functions backwards compatible with older versions of Pandas. The current version is 0.8.1, and it is actively maintained with a regular release cadence.
Warnings
- gotcha Registering methods directly (e.g., with `register_dataframe_method`) is a form of monkey-patching: the custom function is attached to the Pandas class itself, so it appears on every DataFrame in the running process. While convenient, the Pandas community often prefers namespaced accessors (`register_dataframe_accessor`) to prevent potential conflicts and maintain clarity, especially in larger projects or libraries.
- breaking `pandas-flavor` aims for backward compatibility across Pandas versions, but recent major Pandas releases (e.g., Pandas 3.0) introduce significant breaking changes in core behavior, such as a dedicated string dtype by default and Copy-on-Write. User-defined `pandas-flavor` methods and accessors that rely on the old behavior (for example, object-dtype strings or in-place mutation through a view) can silently change results if these changes are not accounted for.
- gotcha Registering a method or accessor with a name that already exists on a Pandas DataFrame or Series can lead to unexpected behavior or overwrite existing functionality, although `pandas-flavor` may issue a warning in some cases.
Install
- pip install pandas-flavor
- conda install -c conda-forge pandas-flavor
Imports
- register_dataframe_method
from pandas_flavor import register_dataframe_method
- register_series_method
from pandas_flavor import register_series_method
- register_dataframe_accessor
from pandas_flavor import register_dataframe_accessor
- register_series_accessor
from pandas_flavor import register_series_accessor
Quickstart
import pandas as pd
import pandas_flavor as pf
@pf.register_dataframe_method
def filter_by_value(df, column, value):
    """Filters a DataFrame to rows where 'column' equals 'value'."""
    return df[df[column] == value]

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Alice"],
    "age": [25, 30, 35, 25]
})
# Now the custom method is available directly on the DataFrame
filtered_df = df.filter_by_value(column="name", value="Alice")
print(filtered_df)