PM4Py - Process Mining for Python

2.7.22.2 verified Fri May 01 auth: no python

PM4Py is an open-source process mining library for Python. It supports process discovery, conformance checking, enhancement, and analysis of event logs. The current version is 2.7.22.2, and new releases appear roughly once a month.

pip install pm4py
error AttributeError: module 'pm4py' has no attribute 'discover_petri_net_alpha'
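cause pm4py.discover_petri_net_alpha() is part of the simplified interface in current 2.x releases, so this error usually means that an older PM4Py is installed or that a local file named pm4py.py is shadowing the package.
fix
Upgrade with pip install -U pm4py and check pm4py.__version__. If upgrading is not an option, call the Alpha Miner through its module path:
from pm4py.algo.discovery.alpha import algorithm as alpha_miner
net, im, fm = alpha_miner.apply(log)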
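Whether pm4py.discover_petri_net_alpha is present depends on the installed release, so a defensive caller can check the version and probe for the attribute before falling back to the module path quoted in the fix above. This is a minimal sketch: installed_version and discover_alpha are illustrative helper names, not part of PM4Py.

```python
from importlib import metadata


def installed_version(package: str = "pm4py"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def discover_alpha(log):
    """Run the Alpha Miner through whichever entry point this install offers."""
    import pm4py  # imported here so the version check above works without PM4Py

    if hasattr(pm4py, "discover_petri_net_alpha"):
        return pm4py.discover_petri_net_alpha(log)

    # Fallback: the module path quoted in the fix above.
    from pm4py.algo.discovery.alpha import algorithm as alpha_miner
    return alpha_miner.apply(log)


print("pm4py version:", installed_version())
```

If the printed version is None, PM4Py is not installed in the active environment at all, which points to an environment problem rather than an API change.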
error pm4py.format_dataframe() KeyError: 'case:concept:name'
cause The required column 'case:concept:name' does not exist in the DataFrame. PM4Py expects column names to be exactly 'case:concept:name', 'concept:name', and 'time:timestamp'.
fix
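Pass your actual column names through the case_id, activity_key, and timestamp_key arguments of format_dataframe, which maps them to 'case:concept:name', 'concept:name', and 'time:timestamp'. Alternatively, rename the columns yourself before calling format_dataframe.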
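The renaming step might look like the sketch below. The raw column names (order_id, step, finished_at), the sample rows, and the helper to_event_log are all hypothetical; only the three target column names come from the entry above.

```python
import pandas as pd

# Hypothetical raw export whose headers do not follow the XES naming scheme.
raw = pd.DataFrame(
    {
        "order_id": ["A1", "A1", "B7"],
        "step": ["create", "approve", "create"],
        "finished_at": [
            "2024-01-05 09:00:00",
            "2024-01-05 11:30:00",
            "2024-01-06 08:15:00",
        ],
    }
)

# Map each raw header to the column name PM4Py looks for.
COLUMN_MAP = {
    "order_id": "case:concept:name",
    "step": "concept:name",
    "finished_at": "time:timestamp",
}

renamed = raw.rename(columns=COLUMN_MAP)


def to_event_log(df):
    """Hand the renamed frame to PM4Py (requires pm4py to be installed)."""
    import pm4py

    return pm4py.format_dataframe(
        df,
        case_id="case:concept:name",
        activity_key="concept:name",
        timestamp_key="time:timestamp",
    )


print(list(renamed.columns))
```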
gotcha PM4Py 2.x has a completely different API from PM4Py 1.x. Many old tutorials and examples are for 1.x and will not work.
fix Use the simplified interface introduced in 2.x: top-level functions such as pm4py.discover_petri_net_inductive() and pm4py.conformance_diagnostics_token_based_replay(). Check the official 2.x documentation for exact names before adapting a 1.x example.
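A 2.x-style workflow chains these top-level calls directly. The sketch below assumes the replay function is pm4py.conformance_diagnostics_token_based_replay and that each per-trace result is a dict with a 'trace_is_fit' entry; fit_ratio and replay_summary are illustrative helpers, not PM4Py functions.

```python
def fit_ratio(diagnostics):
    """Fraction of replayed traces that fit the model (0.0 for an empty list)."""
    if not diagnostics:
        return 0.0
    fitting = sum(1 for trace in diagnostics if trace["trace_is_fit"])
    return fitting / len(diagnostics)


def replay_summary(log):
    """Discover a model from the log, replay the log on it, and summarize fitness."""
    import pm4py  # requires pm4py 2.x

    net, im, fm = pm4py.discover_petri_net_inductive(log)
    diagnostics = pm4py.conformance_diagnostics_token_based_replay(log, net, im, fm)
    return fit_ratio(diagnostics)
```

Keeping the summary step separate from the PM4Py calls makes it possible to check the arithmetic on hand-written results without running a replay.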
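deprecated pm4py.read_csv() is not part of current 2.x releases. pm4py.read_xes() and pm4py.discover_heuristics_net() are still available in the simplified interface; note that discover_heuristics_net() returns a heuristics net rather than a Petri net.
fix Read XES files with pm4py.read_xes(file_path). For CSV files, load the data with pandas.read_csv(csv_path) and pass the result to pm4py.format_dataframe(). To get a Petri net from the Heuristics Miner, use pm4py.discover_petri_net_heuristics().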
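A loader that picks the reader from the file extension could be sketched as follows. It assumes pm4py.read_xes is callable at the package top level and that CSV files already use the three standard column names; log_format and load_event_log are illustrative names, not PM4Py functions.

```python
from pathlib import Path


def log_format(path):
    """Classify an event log file as 'xes' or 'csv' from its extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".xes":
        return "xes"
    if suffix == ".csv":
        return "csv"
    raise ValueError(f"unsupported event log extension: {suffix!r}")


def load_event_log(path):
    """Load an event log with the reader that matches its format."""
    import pandas as pd
    import pm4py  # requires pm4py 2.x

    if log_format(path) == "xes":
        return pm4py.read_xes(str(path))

    # CSV route: pandas reads the file, PM4Py normalizes the columns.
    df = pd.read_csv(path)
    return pm4py.format_dataframe(
        df,
        case_id="case:concept:name",
        activity_key="concept:name",
        timestamp_key="time:timestamp",
    )
```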
breaking Polars support is experimental and APIs may change. Using Polars dataframes with format_dataframe may not be fully stable.
fix Convert Polars dataframes to Pandas using .to_pandas() before passing to PM4Py functions unless you are using new Polars-specific methods.
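The conversion can be wrapped in a small guard so that the same code path accepts either kind of dataframe. This sketch relies on duck typing (pandas DataFrames have no to_pandas method, Polars DataFrames do); ensure_pandas is an illustrative name, and Polars may additionally need pyarrow installed for the conversion.

```python
def ensure_pandas(frame):
    """Return the frame unchanged, or its pandas conversion if it offers one."""
    to_pandas = getattr(frame, "to_pandas", None)
    if callable(to_pandas):
        return to_pandas()
    return frame


def format_any_dataframe(frame):
    """Normalize a pandas or Polars frame, then hand it to PM4Py."""
    import pm4py  # requires pm4py 2.x

    return pm4py.format_dataframe(
        ensure_pandas(frame),
        case_id="case:concept:name",
        activity_key="concept:name",
        timestamp_key="time:timestamp",
    )
```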

Basic process discovery from a CSV event log using the Inductive Miner.

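import pandas as pd
import pm4py

# Load the raw events; the file must contain the three columns named below.
df = pd.read_csv('event_log.csv')
# Parse the timestamps and mark the case, activity, and timestamp columns.
event_log = pm4py.format_dataframe(df, case_id='case:concept:name', activity_key='concept:name', timestamp_key='time:timestamp')
# Discover a Petri net together with its initial and final markings.
net, im, fm = pm4py.discover_petri_net_inductive(event_log)
# Render the net (requires Graphviz to be installed).
pm4py.view_petri_net(net, im, fm)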
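To try the same steps without a CSV file, a small in-memory log can stand in for event_log.csv. The events below are made up for illustration, and discover_and_save is a hypothetical helper that writes the net to an image file with pm4py.save_vis_petri_net instead of opening a viewer.

```python
import pandas as pd

# Made-up events for two cases; not taken from any real log.
sample_log = pd.DataFrame(
    {
        "case:concept:name": ["c1", "c1", "c1", "c2", "c2"],
        "concept:name": ["register", "review", "close", "register", "close"],
        "time:timestamp": pd.to_datetime(
            [
                "2024-03-01 09:00",
                "2024-03-01 10:00",
                "2024-03-01 12:00",
                "2024-03-02 09:30",
                "2024-03-02 11:00",
            ]
        ),
    }
)


def discover_and_save(df, out_path="petri_net.png"):
    """Discover a Petri net from df and save its picture (needs pm4py and Graphviz)."""
    import pm4py

    log = pm4py.format_dataframe(
        df,
        case_id="case:concept:name",
        activity_key="concept:name",
        timestamp_key="time:timestamp",
    )
    net, im, fm = pm4py.discover_petri_net_inductive(log)
    pm4py.save_vis_petri_net(net, im, fm, out_path)
    return net, im, fm


print(sample_log.groupby("case:concept:name").size())
```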