Databricks Feature Store Client
The `databricks-feature-store` library provides a Python client for interacting with the Databricks Feature Store. It lets data scientists and ML engineers create, manage, and discover features for machine learning models within the Databricks platform. The current version is 0.17.0. Its release cadence is typically aligned with Databricks Runtime updates, and release notes for this client library are usually folded into the Databricks documentation rather than published separately.
Warnings
- gotcha The `databricks-feature-store` client is designed to operate primarily within a Databricks Runtime for Machine Learning environment. Full functionality (creating, reading, and writing feature tables) depends on an active Spark session managed by Databricks.
- gotcha The core data operations of the Feature Store client (e.g., `create_feature_table`, `write_table`, `read_table`) require an active Apache Spark session (i.e., a `spark` variable in scope). Without one, these methods raise errors.
- breaking As a `0.x.y` version library, the API is subject to change in minor releases. Breaking changes might occur without a major version increment, requiring updates to existing code.
- gotcha Installing `databricks-feature-store` locally does not automatically install `pyspark`. Attempts to use DataFrame-related functionality will raise `ModuleNotFoundError` (or similar) if `pyspark` is not installed separately.
Install
pip install databricks-feature-store
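Because this is a `0.x.y` library whose minor releases may include breaking changes (see Warnings), pinning the version is a reasonable precaution. A sketch, using the 0.17.0 version noted above:

```shell
# Pin to a known-good release so a minor 0.x bump cannot silently break code.
pip install "databricks-feature-store==0.17.0"
```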
Imports
- FeatureStoreClient
from databricks.feature_store import FeatureStoreClient
Quickstart
import os

# pyspark.sql is often needed for operations using Spark DataFrames:
# from pyspark.sql import SparkSession
from databricks.feature_store import FeatureStoreClient

# NOTE: This client is designed to run primarily within a Databricks Runtime
# environment. Running it locally typically requires an active Spark session
# and Databricks SDK configuration for authentication.

try:
    # Initialize the FeatureStoreClient. In a Databricks notebook this usually
    # works without arguments; for local testing it may require a configured
    # Databricks SDK client or environment variables.
    fs = FeatureStoreClient()
    print("FeatureStoreClient initialized successfully.")
    print("Full functionality (e.g., creating/reading feature tables) requires a Spark session and Databricks connectivity.")

    if os.environ.get("DATABRICKS_RUNTIME_VERSION"):
        # Inside Databricks Runtime a SparkSession named `spark` is
        # pre-initialized, so data operations (e.g., reading feature tables)
        # are available here. Running locally instead requires initializing
        # one manually, e.g.:
        #   spark = SparkSession.builder.appName("local-fs").getOrCreate()
        print("Running within Databricks Runtime; Feature Store operations are available.")
    else:
        print("Skipping full Feature Store operations: not detected in Databricks Runtime.")
except Exception as e:
    print(f"Error initializing FeatureStoreClient: {e}")
    print("Please ensure you are in a Databricks Runtime or have a Spark session and Databricks SDK configured for full functionality.")