Kubeflow Pipelines SDK
The Kubeflow Pipelines SDK (kfp), currently at version 2.16.0, is a Python library for building and deploying portable, scalable machine learning workflows based on Docker containers within the Kubeflow project. It allows users to compose multi-step workflows (pipelines) as a graph of containerized tasks using Python code and/or YAML. Releases are frequent, often bundling the SDK with related components like `kfp-pipeline-spec`, `kfp-server-api`, and `kfp-kubernetes`.
Common errors
- ModuleNotFoundError: No module named 'kfp'
  - cause: The `kfp` Python package is not installed in the current environment (or in the environment where the code actually runs).
  - fix: Install the Kubeflow Pipelines SDK with pip: `pip install kfp`.
- AttributeError: module 'kfp.components' has no attribute 'create_component_from_func'
  - cause: KFP v1 API syntax (`kfp.components.create_component_from_func`) is being used with a KFP v2 SDK installation, typically after migrating from v1 to v2 without updating component definitions.
  - fix: Migrate component definitions to the KFP v2 style: decorate Python functions with `@kfp.dsl.component` instead of wrapping them with `kfp.components.create_component_from_func`.
- TypeError: Input argument supports only the following types: PipelineParam, str, int, float, bool, dict, and list. Got: "None".
  - cause: During pipeline compilation, a component received an argument of an unsupported type, often a `None` value, where a primitive or collection type was expected.
  - fix: Ensure all component inputs are explicitly typed and are valid KFP parameter types (str, int, float, bool, dict, list). Do not pass `None` or custom object instances directly as component inputs.
- kfp or dsl-compile command not found
  - cause: The `kfp` and `dsl-compile` command-line executables, installed with the SDK, are not on your PATH.
  - fix: If you installed with `pip install --user`, add the Python user base binary directory (e.g., `~/.local/bin` on Linux/macOS) to PATH: add `export PATH="$PATH:$HOME/.local/bin"` to your shell configuration file (`~/.bashrc` or `~/.zshrc`), then `source` it or restart your terminal.
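The user base binary directory can be computed instead of hard-coded, which also works on platforms where it is not `~/.local/bin`; a sketch:

```shell
# Compute pip's --user install prefix and append its bin/ directory to PATH.
USER_BIN="$(python3 -m site --user-base)/bin"
export PATH="$PATH:$USER_BIN"
# Persist across sessions by adding the export line to ~/.bashrc or ~/.zshrc.
```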
Warnings
- breaking KFP SDK v2 is generally not backward compatible with user code written against the KFP SDK v1 main namespace. Key breaking changes include a new, more Pythonic SDK with decorators like `@dsl.pipeline` and `@dsl.component`, and compilation to a generic Intermediate Representation (IR) YAML instead of Argo Workflow YAML.
- breaking As of KFP 2.15.0, the default object store deployment for Kubeflow Pipelines has changed from MinIO to SeaweedFS. While MinIO is still supported, users upgrading from versions prior to 2.15.0 with existing or custom MinIO configurations for their backend may need to adjust their deployment manifests to maintain their desired object store configuration.
- breaking KFP 2.15.0 introduced a major upgrade to the underlying Gorm backend, necessitating an automated database index migration. This migration does not support rollback. It is strongly advised to back up production databases before initiating an upgrade from versions prior to 2.15.0.
- gotcha In KFP 2.15.0, a regression was identified for AWS S3 authentication using IAM Roles for Service Accounts (IRSA). Specifically, the environment variables `OBJECTSTORECONFIG_ACCESSKEY` and `OBJECTSTORECONFIG_SECRETACCESSKEY` (which could previously be empty or omitted when using IRSA) became implicitly required, leading to authentication failures.
- gotcha The `@dsl.component` decorator emits a FutureWarning: the default `base_image` will change from 'python:3.11' to 'python:3.12' on Oct 1, 2027. Components that do not explicitly pass a `base_image` argument may behave differently under future SDK versions once the default switches.
Install
- `pip install kfp`
- `pip install kfp[kubernetes]`
Imports
- Client
from kfp import Client
- dsl
from kfp import dsl
- component
from kfp.dsl import component
- pipeline
from kfp.dsl import pipeline
Quickstart
import kfp
from kfp import dsl
import os

# Define a lightweight Python component
@dsl.component
def add(a: float, b: float) -> float:
    '''Calculates the sum of two arguments.'''
    return a + b

# Define a pipeline using the component
@dsl.pipeline(
    name='Addition pipeline',
    description='An example pipeline that performs addition calculations.'
)
def add_pipeline(
    a: float = 1.0,
    b: float = 7.0,
):
    first_add_task = add(a=a, b=4.0)
    second_add_task = add(a=first_add_task.output, b=b)

# --- Running the pipeline (requires a KFP backend) ---
# In a real environment, configure the KFP client to connect to your KFP instance.
# For local testing without a KFP backend, you can use `kfp.local.init`.

# Example of compiling a pipeline (no KFP backend needed for this step)
# compiler = kfp.compiler.Compiler()
# compiler.compile(pipeline_func=add_pipeline, package_path='add_pipeline.yaml')

# Example of running a pipeline against a KFP endpoint
# client = kfp.Client(host=os.environ.get('KFP_HOST', 'http://localhost:8080'))
# run = client.create_run_from_pipeline_func(
#     add_pipeline,
#     arguments={'a': 7.0, 'b': 8.0}
# )
# print(f"Pipeline run initiated: {run.run_id}")

print("Pipeline 'add_pipeline' defined successfully. To run, compile and submit to a KFP backend.")