Swifter

1.4.0 · active · verified Thu Apr 09

Swifter is a Python package that speeds up `apply` operations on pandas DataFrames and Series. It does this by automatically picking the fastest available way to apply a function: it first tries to run the function in vectorized form, and when that is not possible it falls back to either Dask parallel processing or a plain pandas `apply`, depending on which it estimates to be quicker. Swifter registers a `.swifter` accessor on pandas objects, so user-defined functions can be optimized with minimal code changes. The current version is 1.4.0, released in July 2023, and the project is listed as actively maintained.
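The selection idea described above can be pictured with a small sketch. This is a simplified illustration and not Swifter's actual implementation: the helper name `apply_fastest` and its `sample_size` parameter are hypothetical, and the Dask branch is left out entirely. It only shows the "try vectorized first, otherwise fall back to `apply`" step using plain pandas.

```python
import pandas as pd


def apply_fastest(series, func, sample_size=1000):
    """Hypothetical helper: try a vectorized call, else fall back to Series.apply."""
    sample = series.iloc[:sample_size]
    try:
        # Call the function once on the whole sample instead of once per element.
        vectorized = func(sample)
        elementwise = sample.apply(func)
        # Accept the vectorized path only if it reproduces the element-wise result.
        if isinstance(vectorized, pd.Series) and vectorized.equals(elementwise):
            return func(series), "vectorized"
    except Exception:
        # The function cannot operate on a whole Series; use the slow path.
        pass
    return series.apply(func), "apply"


s = pd.Series(range(10))

# Pure arithmetic works on a whole Series, so the vectorized path is taken.
squared, path_a = apply_fastest(s, lambda x: x * x + 1)

# A Python conditional on a Series raises, so this falls back to apply.
labels, path_b = apply_fastest(s, lambda x: "even" if x % 2 == 0 else "odd")

print(path_a, path_b)
```

Functions built from arithmetic or NumPy operations tend to take the vectorized path, while functions containing Python-level branching on individual values usually do not.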

Warnings

Install

Imports

Quickstart

This quickstart applies a custom function to a pandas Series and to a DataFrame through the `.swifter.apply()` accessor. Swifter chooses the execution path automatically (vectorized, Dask, or plain pandas `apply`), based on whether the function can be vectorized and on how long it is estimated to take on the data.

import time

import pandas as pd
import swifter  # importing swifter registers the .swifter accessor on pandas objects

# Create a sample DataFrame
df = pd.DataFrame({
    'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'category': ['A', 'B', 'A', 'C', 'B', 'A', 'C', 'B', 'A', 'C']
})

# Define a custom function
def complex_calculation(x):
    time.sleep(0.001)  # Simulate a time-consuming operation
    return x * x + 1

# Use .swifter.apply() on a Series
df['squared_value'] = df['value'].swifter.apply(complex_calculation)

# Use .swifter.apply() on a DataFrame (row-wise, axis=1)
df['sum_squared'] = df.swifter.apply(lambda row: row['value']**2 + row['value'], axis=1)

# Show the first five rows, including the two new columns
print(df.head())
