{"id":1738,"library":"swifter","title":"Swifter","description":"Swifter is a Python package designed to significantly speed up `apply` operations on pandas DataFrames and Series. It achieves this by automatically determining the fastest available method for applying a function, leveraging vectorized pandas operations, Dask for parallel processing, or multi-threading/multi-processing. Swifter integrates directly into pandas objects, offering a seamless way to optimize user-defined functions without extensive code changes. The current version is 1.4.0, released in July 2023, and the project maintains an active development and release cadence.","status":"active","version":"1.4.0","language":"en","source_language":"en","source_url":"https://github.com/jmcarpenter2/swifter","tags":["pandas","dataframe","performance","optimization","parallel-processing","dask"],"install":[{"cmd":"pip install swifter","lang":"bash","label":"Install base package"},{"cmd":"pip install -U pandas swifter[notebook]","lang":"bash","label":"Install with notebook progress bar"},{"cmd":"pip install -U swifter[groupby]","lang":"bash","label":"Install with groupby.apply dependencies (includes Ray)"},{"cmd":"conda install -c conda-forge swifter","lang":"bash","label":"Install via Conda"}],"dependencies":[{"reason":"Core dependency for DataFrame and Series operations.","package":"pandas","optional":false},{"reason":"Optional dependency for displaying rich progress bars, especially in Jupyter notebooks. Included with `swifter[notebook]`.","package":"tqdm","optional":true},{"reason":"Optional dependency for distributed computing and parallel processing when functions cannot be vectorized. Used internally by swifter.","package":"dask","optional":true},{"reason":"Used internally by swifter for system resource monitoring and optimization decisions.","package":"psutil","optional":true},{"reason":"Optional dependency for optimized `groupby.apply` functionality, specifically for swifter versions >= 1.3.2. 
Included with `swifter[groupby]`.","package":"ray","optional":true}],"imports":[{"note":"Imports swifter to add the `.swifter` accessor to pandas objects.","symbol":"swifter","correct":"import swifter"},{"note":"Standard import for pandas functionality, required for swifter integration.","symbol":"pandas","correct":"import pandas as pd"}],"quickstart":{"code":"import pandas as pd\nimport swifter\n\n# Create a sample DataFrame\ndf = pd.DataFrame({\n    'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n    'category': ['A', 'B', 'A', 'C', 'B', 'A', 'C', 'B', 'A', 'C']\n})\n\n# Define a custom function\ndef complex_calculation(x):\n    import time\n    time.sleep(0.001) # Simulate a time-consuming operation\n    return x * x + 1\n\n# Use swifter.apply() on a Series\ndf['squared_value'] = df['value'].swifter.apply(complex_calculation)\n\n# Use swifter.apply() on a DataFrame (row-wise)\ndf['sum_squared'] = df.swifter.apply(lambda row: row['value']**2 + row['value'], axis=1)\n\nprint(df.head())","lang":"python","description":"This quickstart demonstrates how to apply a custom, computationally intensive function to a pandas Series and DataFrame using `swifter.apply()`. Swifter automatically chooses the most efficient execution backend (vectorized, Dask, or multiprocessing) based on the data size and function complexity."},"warnings":[{"fix":"Ensure that functions passed to `swifter.apply()` are pure functions without side effects on external state.","message":"Avoid using swifter with functions that modify external variables. 
Swifter runs 'sample applies' on a subset of the data to choose the fastest method, so the function executes more times than the final apply alone and external variables can be modified erroneously.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Disable the progress bar with `df.swifter.progress_bar(False).apply(...)`, or globally with `swifter.set_defaults(progress_bar=False)`, when running in forked processes.","message":"When `swifter` is called from a forked process, its progress bar may display incorrectly. Disable the progress bar in such scenarios.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Order imports as `import modin.pandas as pd; import swifter` or call `swifter.register_modin()` if `swifter` is imported first.","message":"For compatibility with Modin DataFrames, `modin.pandas` must be imported *before* `swifter`, or `swifter.register_modin()` must be called explicitly after importing both.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always ensure you are using a recent and updated version of pandas (`pip install -U pandas`).","message":"Swifter relies on recent features of the pandas extension API. Older versions of pandas (e.g., pre-1.0) may not be fully compatible or may cause unexpected behavior.","severity":"breaking","affected_versions":"< 1.0.0 (pandas)"},{"fix":"Structure your `apply` operations to be row-wise (`axis=1`) when dealing with large datasets that would trigger Dask usage, or consider alternative Dask-native operations for column-wise transforms.","message":"When using Dask as a backend for large datasets, `swifter` is limited to `axis=1` (row-wise application) for `df.swifter.apply()`. Attempting `axis=0` with large Dask-backed DataFrames may not use Dask or might result in errors.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}