Azure Machine Learning Feature Store SDK
The `azureml-featurestore` package is the core SDK for Azure ML Feature Store and works alongside `azure-ai-ml` to provide a managed feature store experience. With it you can develop feature set specifications in Spark, list and retrieve feature sets, generate and resolve feature retrieval specifications, and run offline feature retrieval with point-in-time joins. The library is under active development; the current version is 1.2.2.
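To make the term "point-in-time join" concrete, here is a minimal pure-Python sketch of the idea: each observation row receives the most recent feature value recorded at or before the observation's timestamp, so no future data leaks into training. This is only an illustration of the concept, not the SDK's implementation (which runs on Spark and may apply additional delay or lookback settings); the function name `point_in_time_join` and the sample data are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(observations, feature_rows):
    """Attach to each (entity_id, event_time) observation the latest feature
    value whose timestamp is at or before event_time, or None if none exists."""
    # Group feature rows by entity and sort each group by timestamp.
    by_entity = {}
    for entity_id, ts, value in feature_rows:
        by_entity.setdefault(entity_id, []).append((ts, value))
    for rows in by_entity.values():
        rows.sort(key=lambda row: row[0])

    joined = []
    for entity_id, event_time in observations:
        rows = by_entity.get(entity_id, [])
        timestamps = [ts for ts, _ in rows]
        # Index of the first feature row strictly after event_time.
        idx = bisect_right(timestamps, event_time)
        value = rows[idx - 1][1] if idx > 0 else None
        joined.append((entity_id, event_time, value))
    return joined


# Hypothetical sample data: (entity_id, timestamp, feature_value).
feature_rows = [
    ("acct-1", datetime(2024, 1, 1), 10.0),
    ("acct-1", datetime(2024, 1, 5), 25.0),
    ("acct-2", datetime(2024, 1, 3), 7.5),
]
observations = [
    ("acct-1", datetime(2024, 1, 4)),  # sees the Jan 1 value, not the Jan 5 one
    ("acct-2", datetime(2024, 1, 2)),  # no feature value exists yet
]

result = point_in_time_join(observations, feature_rows)
for row in result:
    print(row)
```

The first observation is joined to the value recorded on January 1 because the January 5 value lies in its future; the second observation gets `None` because no value had been recorded for that entity yet.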
Warnings
- breaking The online feature retrieval functions `init_online_lookup`, `shutdown_online_lookup`, and `get_online_features` were moved from being methods of `FeatureStoreClient` to standalone module-level functions. Additionally, the contract for `get_online_features` changed from accepting/returning `pandas.DataFrame` to `pyarrow.Table`.
- gotcha Many Azure ML Feature Store operations, including creating the feature store itself, retrieving features, and running materialization jobs, require specific Azure role-based access control (RBAC) permissions. Insufficient permissions are a common source of errors.
- gotcha Updating an offline materialization store at the feature store level requires disabling offline materialization for all feature sets within that store first. Disabling materialization resets the status of already materialized data, rendering it unusable.
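Because permission problems are easy to mistake for other failures, it can help to surface them explicitly. The sketch below assumes the failure reaches your code as an exception carrying a `status_code` attribute, as `azure.core.exceptions.HttpResponseError` does; not every permission failure surfaces this way (for example, failures inside a Spark job may not). The helper names `is_permission_error` and `run_with_rbac_hint` are hypothetical and not part of any Azure SDK.

```python
def is_permission_error(exc):
    """Return True if exc looks like an authentication or authorization failure."""
    # azure.core.exceptions.HttpResponseError exposes `status_code`; duck typing
    # keeps this helper usable without importing the Azure packages.
    return getattr(exc, "status_code", None) in (401, 403)


def run_with_rbac_hint(operation, description):
    """Call operation(); on a 401/403, re-raise with a hint naming the step."""
    try:
        return operation()
    except Exception as exc:
        if is_permission_error(exc):
            raise PermissionError(
                f"{description} was denied (HTTP {exc.status_code}). "
                "Check the role assignments of the calling identity."
            ) from exc
        raise
```

A call site might look like `run_with_rbac_hint(lambda: list(fs_client.feature_sets.list()), "Listing feature sets")`, where `fs_client` is an already constructed `FeatureStoreClient`.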
Install
pip install azureml-featurestore azure-ai-ml mltable
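After installing, you may want to confirm which versions actually ended up in the environment. The following sketch uses only the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Report the three packages installed by the command above.
for name, ver in installed_versions(
    ["azureml-featurestore", "azure-ai-ml", "mltable"]
).items():
    print(f"{name}: {ver or 'not installed'}")
```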
Imports
- FeatureStoreClient
from azureml.featurestore import FeatureStoreClient
- init_online_lookup
from azureml.featurestore import init_online_lookup
- shutdown_online_lookup
from azureml.featurestore import shutdown_online_lookup
- get_online_features
from azureml.featurestore import get_online_features
Quickstart
import os
from azure.ai.ml import MLClient
from azure.ai.ml.entities import FeatureStore
from azure.identity import DefaultAzureCredential
from azureml.featurestore import FeatureStoreClient
# Replace with your actual subscription, resource group, and feature store name
subscription_id = os.environ.get('AZURE_SUBSCRIPTION_ID', 'your-subscription-id')
resource_group_name = os.environ.get('AZURE_RESOURCE_GROUP', 'your-resource-group')
feature_store_name = os.environ.get('AZURE_FEATURE_STORE_NAME', 'my-feature-store')
feature_store_location = os.environ.get('AZURE_LOCATION', 'eastus')
# Authenticate and create MLClient for managing Azure ML resources
try:
    credential = DefaultAzureCredential()
    ml_client = MLClient(
        credential=credential,
        subscription_id=subscription_id,
        resource_group_name=resource_group_name,
    )
except Exception as e:
    print(f"Could not authenticate or create MLClient: {e}")
    print("Please ensure the AZURE_SUBSCRIPTION_ID, AZURE_RESOURCE_GROUP, and AZURE_LOCATION environment variables are set, or replace the placeholder values.")
    raise SystemExit(1)
# Define the feature store. Creating it is a one-time step that is often done
# separately (for example with the Azure CLI or a setup script), so this
# quickstart only builds the definition and assumes the store already exists.
# In a real scenario, you would check for existence and create it if necessary.
print(f"Defining Feature Store '{feature_store_name}'...")
feature_store = FeatureStore(
    name=feature_store_name,
    location=feature_store_location,
    description="My first Azure ML Feature Store",
)
# To actually create or update the store, you would typically run:
# ml_client.feature_stores.begin_create_or_update(feature_store).result()
# That call is left out here so the quickstart can focus on the azureml-featurestore SDK.
print("Assuming the feature store already exists. Initializing FeatureStoreClient...")
# Initialize FeatureStoreClient, which is used to develop and consume features.
fs_client = FeatureStoreClient(
    credential=credential,
    subscription_id=subscription_id,
    resource_group_name=resource_group_name,
    name=feature_store_name,
)
print(f"Successfully initialized FeatureStoreClient for '{feature_store_name}'.")
# Example: list the feature sets registered in the feature store.
# The loop prints nothing if the store has no feature sets yet.
print("Listing feature sets (the result may be empty if none are registered)...")
feature_sets = fs_client.feature_sets.list()
for fs in feature_sets:
    print(f"  - Found Feature Set: {fs.name} (Version: {fs.version})")
print("Quickstart finished. For advanced usage, refer to Azure ML Feature Store documentation.")
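The quickstart falls back to placeholder strings when an environment variable is missing, which means a misconfigured environment only fails later, inside an Azure call. One alternative is to fail fast before any client is constructed. The sketch below reuses the environment variable names from the quickstart; the helper `load_settings` is hypothetical, and its returned keys are chosen to match the keyword arguments the quickstart passes to `FeatureStoreClient`.

```python
import os

# Environment variable names used by the quickstart.
REQUIRED_VARS = (
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_FEATURE_STORE_NAME",
)


def load_settings(env=None):
    """Read the required settings, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {
        "subscription_id": env["AZURE_SUBSCRIPTION_ID"],
        "resource_group_name": env["AZURE_RESOURCE_GROUP"],
        "name": env["AZURE_FEATURE_STORE_NAME"],
    }
```

With this in place, the client could be created as `FeatureStoreClient(credential=credential, **load_settings())`, and a missing variable is reported by name instead of surfacing as a failed request against `your-subscription-id`.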