Joblib: Lightweight pipelining with Python functions

1.5.3 verified Tue May 12 auth: no python install: verified quickstart: stale

Joblib is a set of tools for lightweight pipelining in Python, providing transparent disk-caching of functions and easy parallel computing. Current version: 1.5.3. Release cadence: Regular updates with recent releases in 2025 and 2026.

pip install joblib
error ImportError: cannot import name 'joblib' from 'sklearn.externals'
cause `sklearn.externals.joblib` was deprecated in scikit-learn 0.21 and removed in 0.23. `joblib` is now an independent package, so the old import path no longer exists.
fix
Install joblib separately if you haven't already (`pip install joblib`) and replace `from sklearn.externals import joblib` with `import joblib`.
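A minimal sketch of the corrected import, assuming joblib is installed as a standalone package. The `roundtrip` helper and the `model.joblib` filename are illustrative only and not part of joblib.

```python
import os
import tempfile

# Standalone package; replaces `from sklearn.externals import joblib`
import joblib


def roundtrip(obj):
    """Save an object with joblib.dump and read it back with joblib.load."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.joblib")  # hypothetical filename
        joblib.dump(obj, path)
        return joblib.load(path)


if __name__ == "__main__":
    print(roundtrip({"weights": [1, 2, 3]}))
```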
error FileNotFoundError: [Errno 2] No such file or directory: 'your_model.joblib'
cause `joblib.load()` cannot find the specified file because the file path is incorrect, the file does not exist at that location, or the script's current working directory is not what is expected.
fix
Ensure the file path provided to `joblib.load()` is correct and that the file exists. Use absolute paths, `os.path.join` for robust path construction, or adjust the script's working directory.
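One way to apply this, sketched under the assumption that the file lives in a known base directory. `load_artifact` and `demo` are hypothetical helpers; only `joblib.dump` and `joblib.load` come from joblib.

```python
import tempfile
from pathlib import Path

import joblib


def load_artifact(base_dir, filename):
    """Resolve an absolute path and fail with a descriptive message if it is missing."""
    path = Path(base_dir).expanduser().resolve() / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"No joblib file at {path} (current working directory: {Path.cwd()})"
        )
    return joblib.load(str(path))


def demo():
    """Write a small object to a temporary directory and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        joblib.dump([1, 2, 3], str(Path(tmp) / "your_model.joblib"))
        return load_artifact(tmp, "your_model.joblib")


if __name__ == "__main__":
    print(demo())
```

In a script, passing `Path(__file__).resolve().parent` as `base_dir` anchors the lookup to the script's own folder rather than to the current working directory.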
error AttributeError: Can't get attribute 'your_function_or_class' on <module '__main__' (built-in)>
cause The saved object references a custom class or function that was defined in the `__main__` module of the script that saved it (for example, in the script body or an interactive session). Pickle stores only the import path of such definitions, so loading fails when the loading process has no matching definition in its own `__main__`.
fix
Ensure that any custom classes or functions referenced in the saved object are defined in a separate Python module (e.g., my_module.py) and that this module is imported in both the saving and loading scripts. The module containing the definitions must be importable in the loading environment.
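The sketch below simulates this fix in a single file: it writes a hypothetical `my_module.py` to a temporary directory, imports it, then saves and reloads an instance. `Scaler` and `demo` are illustrative names. In a real project `my_module.py` would simply sit alongside the saving and loading scripts.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

import joblib

# Hypothetical module contents; stands in for a real my_module.py on disk.
MODULE_SOURCE = textwrap.dedent(
    """
    class Scaler:
        def __init__(self, factor):
            self.factor = factor

        def apply(self, x):
            return x * self.factor
    """
)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "my_module.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)  # make my_module importable
        importlib.invalidate_caches()
        try:
            my_module = importlib.import_module("my_module")
            path = str(Path(tmp, "scaler.joblib"))
            # The pickle records "my_module.Scaler", not "__main__.Scaler"
            joblib.dump(my_module.Scaler(3), path)
            restored = joblib.load(path)
            return restored.apply(2)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("my_module", None)


if __name__ == "__main__":
    print(demo())
```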
error ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking.
cause When using `joblib.Parallel` on systems that do not support forking (like Windows), or when running scripts directly without the necessary protection, the main script can be recursively re-imported, leading to this error.
fix
Wrap the main execution logic that uses `joblib.Parallel` within an `if __name__ == '__main__':` block.
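A minimal sketch of the guarded layout. The `main` function name and `n_jobs=2` are arbitrary choices for illustration.

```python
from math import sqrt

from joblib import Parallel, delayed


def main():
    # Parallel work is started only from inside main(), never at import time
    return Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(5))


if __name__ == "__main__":
    # Worker processes that re-import this file skip this block
    print(main())
```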
error BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
cause This error occurs during parallel processing when the objects (arguments or return values) being sent to or from worker processes cannot be serialized (pickled). This can happen because of non-picklable object types, mismatched `joblib` or Python versions between processes, or issues with third-party libraries.
fix
Ensure all objects passed to delayed functions and returned by them are picklable. Check for version compatibility between joblib, Python, and any other libraries involved. Sometimes, upgrading joblib (which bundles its own copy of cloudpickle) or ensuring consistent Python versions can resolve this.
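A rough pre-flight check, sketched with the standard library only. `is_picklable` and `unpicklable_arguments` are hypothetical helpers, not joblib functions. The check uses plain `pickle`, so it may be stricter than what joblib's worker backend actually accepts.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if the object survives pickle.dumps."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


def unpicklable_arguments(*args, **kwargs):
    """List the positional and keyword arguments that fail to pickle."""
    bad = [f"args[{i}]" for i, value in enumerate(args) if not is_picklable(value)]
    bad += [name for name, value in kwargs.items() if not is_picklable(value)]
    return bad


if __name__ == "__main__":
    # A lock cannot be pickled, so it is reported; the list and string are fine
    print(unpicklable_arguments([1, 2, 3], threading.Lock(), name="ok"))
```

Running the check on the arguments of a failing task before handing them to `delayed` can narrow down which object is responsible.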
breaking Joblib 1.5.3 introduces changes to the Memory class that may affect existing cache directories.
fix Review and update cache directory configurations to align with new Memory class behavior.
gotcha `joblib.load()` is built on pickle, so loading a file from an untrusted source can execute arbitrary code.
fix Only load files that you created yourself or obtained from a source you trust.
gotcha ModuleNotFoundError: No module named 'numpy'. The numpy package is not installed in the environment. Installing joblib does not install numpy, but the quickstart below imports it.
fix Install numpy with `pip install numpy`, or add it to the environment's setup process or requirements.
python os / libc status wheel install import disk
3.9 alpine (musl) - - 0.33s 19.4M
3.9 slim (glibc) - - 0.28s 20M
3.10 alpine (musl) - - 0.34s 19.9M
3.10 slim (glibc) - - 0.23s 20M
3.11 alpine (musl) - - 0.46s 22.2M
3.11 slim (glibc) - - 0.39s 23M
3.12 alpine (musl) - - 0.70s 14.0M
3.12 slim (glibc) - - 0.63s 14M
3.13 alpine (musl) - - 0.71s 13.6M
3.13 slim (glibc) - - 0.64s 14M

Example of using Joblib's Memory class for caching a function's output.

import numpy as np
from joblib import Memory

# Set up a cache directory
location = 'your_cache_dir'
mem = Memory(location, verbose=1)

# Define a function whose results are worth caching
def square(x):
    return np.square(x)

# Wrap the function so its results are stored on disk
cached_square = mem.cache(square)

# The first call computes and stores the result;
# a repeated call with the same input is read back from the cache
result = cached_square(np.array([1, 2, 3]))
print(result)
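
To see the cache at work, the sketch below records every real invocation of the wrapped function. It assumes a throwaway temporary directory as the cache location. `slow_square`, `calls`, and `demo` are illustrative names.

```python
import tempfile

from joblib import Memory

calls = []  # records each time the function body actually runs


def slow_square(x):
    calls.append(x)
    return x * x


def demo():
    calls.clear()
    with tempfile.TemporaryDirectory() as tmp:
        memory = Memory(tmp, verbose=0)
        cached = memory.cache(slow_square)
        first = cached(4)   # computed and written to the cache
        second = cached(4)  # served from the cache, body not executed
        third = cached(5)   # new argument, computed again
        return first, second, third, list(calls)


if __name__ == "__main__":
    print(demo())
```

The repeated call with the same argument leaves `calls` unchanged, which is what indicates a cache hit.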