# KServe Python SDK
The KServe Python SDK provides a client library for interacting with KServe (formerly KFServing) on Kubernetes. It lets users define, deploy, and manage machine learning inference services programmatically. At the time of writing, the current version is 0.17.0; releases are typically aligned with the main KServe project, with new versions published every few months.
## Common errors

- `ModuleNotFoundError: No module named 'kfserving'`
  - **Cause:** you are importing from the old, deprecated `kfserving` package.
  - **Fix:** uninstall the old `kfserving` package and install the new `kserve` package (`pip uninstall kfserving; pip install kserve`), then update your import statements to `from kserve import ...`.
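A quick, stdlib-only way to check which package your environment actually provides before fixing imports (the helper name is illustrative, not part of the SDK):

```python
import importlib.util

def check_kserve_migration():
    """Report which KServe package is importable, preferring the new name."""
    if importlib.util.find_spec("kserve") is not None:
        return "kserve"
    if importlib.util.find_spec("kfserving") is not None:
        return "kfserving (deprecated: pip uninstall kfserving; pip install kserve)"
    return "neither (pip install kserve)"

print(check_kserve_migration())
```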
- `kubernetes.config.config_exception.ConfigException: Cannot load kube-config from any of: [...]`
  - **Cause:** the Kubernetes client cannot find or read a valid kubeconfig file (e.g., `~/.kube/config`) to connect to a cluster.
  - **Fix:** ensure `kubectl` is configured and can reach your cluster. When running inside a cluster, call `kubernetes.config.load_incluster_config()` instead (the quickstart below shows the fallback).
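Before initializing the client, you can check which configuration source is likely available. This stdlib-only sketch mirrors the standard lookup: the `KUBECONFIG` environment variable, the default `~/.kube/config` path, and the in-cluster service-account token mount (the helper names are illustrative):

```python
import os

def kubeconfig_path():
    """Resolve the kubeconfig path the Kubernetes client will look for."""
    return os.environ.get("KUBECONFIG", os.path.expanduser("~/.kube/config"))

def in_cluster():
    """True when running inside a pod with a service-account token mounted."""
    return os.path.exists("/var/run/secrets/kubernetes.io/serviceaccount/token")

if in_cluster():
    print("Use load_incluster_config()")
elif os.path.exists(kubeconfig_path()):
    print(f"Use load_kube_config() with {kubeconfig_path()}")
else:
    print("No Kubernetes config found; configure kubectl first")
```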
- `AttributeError: 'KServeClient' object has no attribute 'create_isvc'`
  - **Cause:** you are calling an API method that has been renamed or removed in newer KServe SDK versions.
  - **Fix:** check the latest KServe Python SDK documentation for current method names. For example, `create_isvc` was replaced by the more generic `create(isvc)`.
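If your code must run against multiple SDK versions, one way to bridge the rename is a small `getattr` fallback. This is a sketch (the wrapper name is mine, not the SDK's), shown here with a stub client so it is self-contained:

```python
def create_inference_service(client, isvc, namespace=None):
    """Call whichever create method this client exposes.

    Newer KServe clients expose create(isvc); some older releases used
    other names such as create_isvc.
    """
    for name in ("create", "create_isvc"):
        method = getattr(client, name, None)
        if callable(method):
            if namespace is None:
                return method(isvc)
            return method(isvc, namespace=namespace)
    raise AttributeError("client exposes no known create method")

# Stub standing in for a KServeClient, to demonstrate the dispatch:
class StubClient:
    def create(self, isvc, namespace=None):
        return ("created", namespace)

print(create_inference_service(StubClient(), {"kind": "InferenceService"}))
```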
## Warnings
- breaking The project was renamed from KFServing to KServe. The Python package `kfserving` is deprecated and no longer maintained. Users must migrate to the `kserve` package.
- gotcha The KServe Python SDK relies heavily on the official `kubernetes` Python client to interact with your cluster. This dependency is not automatically installed with `pip install kserve`.
- gotcha KServe's underlying Kubernetes API objects (like InferenceService) are evolving. While `v1beta1` is still widely used and supported by the SDK, future versions of KServe may promote `v1` as the primary API.
## Install

```shell
pip install kserve kubernetes
```
## Imports

- `KServeClient`: `from kserve import KServeClient`
- `V1beta1InferenceService`: `from kserve import V1beta1InferenceService` (the old `from kfserving.models import ...` path is deprecated)
- `constants`: `from kserve import constants`
- `client`: `from kubernetes import client as k8s_client`
## Quickstart

```python
import os

from kubernetes import client as k8s_client, config as k8s_config
from kserve import (
    KServeClient,
    constants,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec,
)

# --- Configuration and client initialization ---
# This example assumes kubectl is configured to connect to a Kubernetes
# cluster; when running inside a cluster it falls back to the in-cluster
# config. Note that config loading lives in kubernetes.config, not in
# kubernetes.client.
try:
    k8s_config.load_kube_config()
except k8s_config.ConfigException:
    print("Warning: could not load kube-config. Attempting in-cluster config.")
    try:
        k8s_config.load_incluster_config()
    except k8s_config.ConfigException:
        raise SystemExit(
            "Error: could not load any Kubernetes config. "
            "Ensure kubectl is configured or run within a cluster."
        )

api_version = constants.KSERVE_V1BETA1  # 'serving.kserve.io/v1beta1'
kserve_client = KServeClient()
namespace = os.environ.get('K8S_NAMESPACE', 'default')  # env var or default
service_name = 'sklearn-iris-quickstart'

# --- Define an InferenceService ---
isvc = V1beta1InferenceService(
    api_version=api_version,
    kind=constants.KSERVE_KIND,
    metadata=k8s_client.V1ObjectMeta(name=service_name, namespace=namespace),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            sklearn=V1beta1SKLearnSpec(
                storage_uri='gs://kfserving-examples/models/sklearn/iris',
                protocol_version='v1',
            )
        )
    ),
)

print(f"Creating InferenceService '{service_name}' in namespace '{namespace}'...")

# --- Create and wait for the InferenceService ---
try:
    kserve_client.create(isvc)
    print(f"InferenceService '{service_name}' created. Waiting for it to be ready...")
    kserve_client.wait_isvc_ready(service_name, namespace=namespace)
    print(f"InferenceService '{service_name}' is ready:")
    print(kserve_client.get(service_name, namespace=namespace))
    # To delete the service:
    # kserve_client.delete(service_name, namespace=namespace)
except Exception as e:
    print(f"Failed to create or wait for InferenceService: {e}")
    # Attempt cleanup if creation partially succeeded but failed later
    try:
        kserve_client.delete(service_name, namespace=namespace)
        print(f"Attempted cleanup of '{service_name}'.")
    except Exception as cleanup_e:
        print(f"Failed to clean up '{service_name}': {cleanup_e}")
```
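Under the hood, the typed SDK objects in the quickstart serialize to a plain Kubernetes manifest. The stdlib-only sketch below shows roughly what that manifest looks like as a dict (the helper name is mine; field names follow the Kubernetes camelCase convention, e.g. `storage_uri` becomes `storageUri`):

```python
def iris_isvc_manifest(name="sklearn-iris-quickstart", namespace="default"):
    """Plain-dict sketch of the quickstart's InferenceService manifest."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                "sklearn": {
                    "storageUri": "gs://kfserving-examples/models/sklearn/iris",
                    "protocolVersion": "v1",
                }
            }
        },
    }

print(iris_isvc_manifest()["metadata"]["name"])
```

Inspecting this shape (e.g. via the object's `to_dict()` method from the generated Kubernetes models) is a useful sanity check before applying it to a cluster.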