KServe Python SDK

0.17.0 · active · verified Fri Apr 17

The KServe Python SDK provides a client library for interacting with KServe (formerly KFServing) on Kubernetes. It allows users to define, deploy, and manage machine learning inference services programmatically. The current version is 0.17.0. Releases are typically aligned with the main KServe project, with new versions published every few months.

Install

pip install kserve
Quickstart

This quickstart demonstrates how to initialize the KServe client, define an InferenceService for a scikit-learn model, and deploy it to a Kubernetes cluster. It requires the `kubernetes` package and a configured `kubectl` context or running inside a Kubernetes pod.

import os
from kubernetes import client as k8s_client
from kubernetes import config as k8s_config
from kserve import KServeClient, constants
from kserve import V1beta1InferenceService, V1beta1InferenceServiceSpec, V1beta1PredictorSpec, V1beta1SKLearnSpec

# --- Configuration and Client Initialization ---
# Note: config loading lives in kubernetes.config, not kubernetes.client.
# Load a local kube-config (~/.kube/config) first; fall back to the
# in-cluster service-account config when running inside a pod.
try:
    k8s_config.load_kube_config()
except k8s_config.ConfigException:
    print("Warning: Could not load kube-config. Attempting in-cluster config.")
    try:
        k8s_config.load_incluster_config()
    except k8s_config.ConfigException:
        raise SystemExit("Error: Could not load any Kubernetes config. Ensure kubectl is configured or run within a cluster.")

api_version = constants.KSERVE_API_VERSION
kserve_client = KServeClient()

namespace = os.environ.get('K8S_NAMESPACE', 'default') # Use an environment variable or default
service_name = 'sklearn-iris-quickstart'

# --- Define an InferenceService ---
isvc = V1beta1InferenceService(
    api_version=api_version,
    kind=constants.KSERVE_KIND,
    metadata=k8s_client.V1ObjectMeta(
        name=service_name, namespace=namespace
    ),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            sklearn=V1beta1SKLearnSpec(
                storage_uri='gs://kfserving-examples/models/sklearn/iris',
                protocol_version='v1'
            )
        )
    )
)

print(f"Creating InferenceService '{service_name}' in namespace '{namespace}'...")
# --- Create and Wait for InferenceService ---
try:
    kserve_client.create(isvc)
    print(f"InferenceService '{service_name}' created. Waiting for it to be ready...")
    kserve_client.wait_isvc_ready(service_name, namespace=namespace)
    print(f"InferenceService '{service_name}' is ready:")
    print(kserve_client.get(service_name, namespace=namespace))
    # Example of how to delete the service:
    # kserve_client.delete(service_name, namespace=namespace)
    # print(f"InferenceService '{service_name}' deleted.")
except Exception as e:
    print(f"Failed to create or wait for InferenceService: {e}")
    # Attempt cleanup if creation partially succeeded but failed later
    try:
        kserve_client.delete(service_name, namespace=namespace)
        print(f"Attempted cleanup of '{service_name}'.")
    except Exception as cleanup_e:
        print(f"Failed to clean up '{service_name}': {cleanup_e}")
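Once the InferenceService reports Ready, predictions are served over KServe's V1 inference protocol: a JSON body with an `instances` list is POSTed to `/v1/models/<name>:predict` at the service's URL (available under `status.address.url` on the InferenceService object). A minimal sketch of building such a request, using the quickstart's service name; the feature values are illustrative iris measurements:

```python
import json

def build_v1_predict_request(instances):
    # V1 inference protocol request body: {"instances": [...]},
    # one inner list of feature values per prediction.
    return json.dumps({"instances": instances})

def v1_predict_path(model_name):
    # V1 protocol predict endpoint; the model name matches the
    # InferenceService name.
    return f"/v1/models/{model_name}:predict"

body = build_v1_predict_request([[6.8, 2.8, 4.8, 1.4]])
path = v1_predict_path("sklearn-iris-quickstart")
print(path)  # /v1/models/sklearn-iris-quickstart:predict
print(body)
```

The resulting body and path can then be sent with any HTTP client (e.g. `requests.post(url + path, data=body)`), assuming the service URL is reachable from your machine, for instance through a configured ingress.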
