ppft: Parallel Python Framework
ppft is a friendly fork of Parallel Python (pp), designed to provide distributed and parallel Python capabilities on Symmetric Multi-Processing (SMP) systems and clusters. It offers an easier installation process and enhanced serialization through the optional `dill` library. Currently at version 1.7.8, ppft maintains an active development and release cadence, with recent updates focused on Python 3 compatibility and dropping support for older Python versions.
Common errors
- import ppft
  cause: ppft installs itself as the 'pp' module, and core functionality is accessed via `import pp`. Importing directly from `ppft` is not the intended way to use the library's core features.
  fix: Use `import pp` to access the library. For example, `job_server = pp.Server()`.
- ModuleNotFoundError: No module named 'pp'
  cause: The Python interpreter cannot find the 'pp' module, which is how ppft is imported. Typical reasons are that ppft is not installed, that it is installed in a different Python environment, or that the Python path is misconfigured.
  fix: Install ppft in your active Python environment with `pip install ppft`. If the original 'Parallel Python' (pp) library is installed, uninstall it first (`pip uninstall pp`) to avoid conflicts, then install `ppft`.
- AttributeError: Can't get attribute 'PartialSum' on <module 'ppft.__main__' from '...'>
  cause: This AttributeError indicates a serialization issue. It is often encountered when using ppft with newer Python versions (e.g., Python 3.13), where there may be incompatibilities in how objects are pickled or unpickled across processes.
  fix: Check that your Python version meets ppft's requirements (version 1.7.8 requires Python >=3.9; newer versions may require >=3.10). If the issue persists, consider installing ppft with the optional 'dill' dependency (`pip install ppft[dill]`), since dill can handle more complex Python objects than the default pickle module.
- TypeError: cannot pickle 'function' object
  cause: ppft, like other parallel processing libraries, relies on serialization (pickling) to pass functions and objects between processes. This error occurs when a function or object submitted for parallel execution cannot be serialized by the default `pickle` module, often because of closures, lambda functions, or objects with complex state.
  fix: Install ppft with the optional `dill` dependency using `pip install ppft[dill]`. `dill` can serialize a wider range of Python objects and functions, which resolves many pickling issues.
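The pickling failures described above can be reproduced without ppft at all. The following is a minimal sketch, not ppft's own code: `try_pickle`, `named_function`, and `anonymous_function` are hypothetical names chosen for illustration, and the `dill` branch only runs if `dill` happens to be installed.

```python
import pickle


def try_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def named_function(x):
    return x + 1


# A lambda has no importable name, so pickle cannot store a reference to it.
anonymous_function = lambda x: x + 1

print("named function picklable:", try_pickle(named_function))
print("lambda picklable:", try_pickle(anonymous_function))

# dill is optional; fall back gracefully when it is missing.
try:
    import dill
except ImportError:
    dill = None

if dill is not None:
    # dill serializes the lambda by value, so it survives a round trip.
    restored = dill.loads(dill.dumps(anonymous_function))
    print("dill round-trip of lambda:", restored(1))
```

When run as a script, the lambda is expected to be rejected by `pickle`, which is the same class of failure that surfaces as the errors listed above when such an object is handed to a worker process.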
Warnings
- breaking: Python 2.x support was dropped in early versions, and the minimum Python 3.x version has risen incrementally. Version 1.7.8 requires Python >=3.9, and the documentation for the upcoming 1.7.9.dev0 indicates Python >=3.10. Check that your environment meets the requirement of the specific version you install.
- gotcha: ppft installs itself as the 'pp' module. If the original 'Parallel Python' (pp) library is already installed in your environment, `import pp` may resolve to the original library instead of the ppft fork, which can lead to unexpected behavior or missing features.
- gotcha: For serialization of complex Python objects (e.g., lambdas, nested functions, objects with custom serialization), installing ppft with the optional 'dill' dependency is highly recommended. Without it, some objects may fail to serialize for parallel execution.
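One way to check for the situations in the warnings above is to inspect the environment before importing anything. This is a hedged sketch using only the standard library; `describe_pp_environment` and the keys of its report are hypothetical names, and the default minimum of Python 3.9 is taken from the 1.7.8 requirement stated above.

```python
import importlib.util
import sys
from importlib import metadata


def _locate(module_name):
    """Return the file a top-level module would load from, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


def describe_pp_environment(min_python=(3, 9)):
    """Collect facts that help diagnose which 'pp' module would be imported."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "ppft_version": None,  # version of the ppft distribution, if installed
        "pp_location": _locate("pp"),  # file that 'import pp' would load
        "dill_installed": _locate("dill") is not None,
    }
    try:
        report["ppft_version"] = metadata.version("ppft")
    except metadata.PackageNotFoundError:
        pass
    return report


for key, value in describe_pp_environment().items():
    print(f"{key}: {value}")
```

If `ppft_version` is `None` while `pp_location` points at a file, `import pp` is probably resolving to something other than the ppft fork.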
Install
- pip install ppft
- pip install ppft[dill]
Imports
- Server
  wrong: from ppft import Server
  correct: import pp; job_server = pp.Server()
- pp
  correct: import pp
Quickstart
import math

import pp


def my_function(a, b):
    return math.sqrt(a**2 + b**2)


# Create a job server.
# The number of workers can be specified, e.g., pp.Server(4).
job_server = pp.Server()

# Submit jobs. The worker needs the math module, so it is listed in modules.
jobs = []
for i in range(10):
    # Arguments: func, args, depfuncs, modules
    jobs.append(job_server.submit(my_function, (i, i + 1), (), ("math",)))

# Retrieve results (calling a job blocks until it has finished).
results = [job() for job in jobs]
print(f"Calculated results: {results}")

# Destroy the job server.
job_server.destroy()
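When the submitted function calls other functions you have defined, those helpers have to be made available to the workers as well. The sketch below assumes the `submit(func, args, depfuncs, modules)` argument order of the original Parallel Python API; `square`, `sum_squares`, and `run_parallel` are hypothetical names used only for illustration.

```python
def square(x):
    """Helper that the submitted function depends on."""
    return x * x


def sum_squares(n):
    """Sum of the squares of 0..n-1; calls the helper square()."""
    total = 0
    for i in range(n):
        total += square(i)
    return total


def run_parallel(limits):
    """Submit sum_squares once per limit and collect the results in order."""
    import pp  # imported lazily so the helpers above work without ppft

    job_server = pp.Server()
    try:
        jobs = [
            # Arguments: func, args, depfuncs, modules
            job_server.submit(sum_squares, (n,), (square,), ())
            for n in limits
        ]
        return [job() for job in jobs]
    finally:
        # Shut the workers down even if a job raised an exception.
        job_server.destroy()
```

With ppft installed, `run_parallel([10, 100])` should return the same values as `[sum_squares(10), sum_squares(100)]` computed serially, which makes for an easy sanity check.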