Chalk Python SDK
Chalkpy is the Python SDK for Chalk, a feature store designed to simplify feature engineering and deployment for machine learning teams. Users define feature pipelines with ordinary Python functions and data structures, and Chalk orchestrates them on a Rust-based engine for parallel execution. Features are declared with Pydantic-inspired classes, and resolvers compute them for both online inference and offline training. The current version is 2.115.4; the changelog shows frequent releases.
Warnings
- gotcha Authentication requires the `CHALK_CLIENT_ID` and `CHALK_CLIENT_SECRET` environment variables to be set, or a prior login with the `chalk login` CLI command. Without either, client calls fail with connection errors.
- gotcha Some functionality requires installing with extras. For example, `chalkpy[runtime]` is recommended for notebook environments, and `chalkpy[chalkdf]` is needed for features that use Chalk DataFrames. A bare `pip install chalkpy` may not include every component these use cases need.
- gotcha The `cache_nulls` parameter for features defaults to `True`, meaning Chalk caches all values, including nulls. A null value therefore replaces an existing cached value, which can be surprising unless `cache_nulls` is set explicitly.
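The `cache_nulls` gotcha is easier to see in a toy model. The sketch below is plain Python and does not use or reproduce Chalk's caching code: `ToyFeatureCache` and its methods are hypothetical names. It also assumes that a null is simply dropped when `cache_nulls` is off, which the text above does not state. It only illustrates why a cached value can disappear under the default setting.

```python
from typing import Any, Dict, Optional


class ToyFeatureCache:
    """Toy in-memory cache for illustration only; NOT Chalk's implementation."""

    def __init__(self, cache_nulls: bool = True) -> None:
        self.cache_nulls = cache_nulls
        self._store: Dict[str, Any] = {}

    def write(self, key: str, value: Optional[Any]) -> None:
        # Assumption: with cache_nulls=False a null result is dropped and
        # the previously cached value (if any) is left in place.
        if value is None and not self.cache_nulls:
            return
        self._store[key] = value

    def read(self, key: str) -> Optional[Any]:
        return self._store.get(key)


caching = ToyFeatureCache(cache_nulls=True)
caching.write("user:1.email", "a@example.com")
caching.write("user:1.email", None)  # the null overwrites the cached value

skipping = ToyFeatureCache(cache_nulls=False)
skipping.write("user:1.email", "a@example.com")
skipping.write("user:1.email", None)  # the null is ignored

print(caching.read("user:1.email"))   # None
print(skipping.read("user:1.email"))  # a@example.com
```

In the toy model, the only difference between the two caches is the flag, so the lost value in the first case comes entirely from caching the null.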
Install
- pip install chalkpy
- pip install "chalkpy[runtime]"
- pip install "chalkpy[chalkdf]"
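After installing, it can help to confirm which version of the package is actually present in the active environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and is not part of chalkpy. It does not detect which extras were installed.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("chalkpy")
if version is None:
    print("chalkpy is not installed; run: pip install chalkpy")
else:
    print(f"chalkpy {version} is installed")
```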
Imports
- ChalkClient
from chalk.client import ChalkClient
- features
from chalk import features
- online
from chalk import online
- offline
from chalk import offline
- _
from chalk import _
Quickstart
import os

from chalk.client import ChalkClient

# Set these environment variables, or authenticate beforehand with the `chalk login` CLI command.
# Unset variables are read as None rather than as empty strings.
CHALK_CLIENT_ID = os.environ.get("CHALK_CLIENT_ID")
CHALK_CLIENT_SECRET = os.environ.get("CHALK_CLIENT_SECRET")

if not CHALK_CLIENT_ID or not CHALK_CLIENT_SECRET:
    print(
        "Warning: CHALK_CLIENT_ID and CHALK_CLIENT_SECRET are not set. "
        "Authentication will likely fail unless you have run `chalk login`. "
        "Refer to the Chalk documentation for authentication."
    )

try:
    client = ChalkClient(
        client_id=CHALK_CLIENT_ID,
        client_secret=CHALK_CLIENT_SECRET,
        branch="notebook",  # or 'production', or a custom branch name
    )
    # Verify the setup
    whoami_response = client.whoami()
    print(f"Successfully authenticated as user ID: {whoami_response.user}")
except Exception as e:
    print(f"Authentication failed or API call error: {e}")