KaggleHub Library
KaggleHub is a Python library that provides a unified interface for programmatically accessing and downloading Kaggle resources, primarily models and datasets, from outside the Kaggle platform. It standardizes resource paths and simplifies integration with other ML frameworks. The current version is 1.0.0; releases are frequent, with minor versions often appearing every few weeks.
Warnings
- gotcha KaggleHub (this library) is distinct from the general `kaggle` API client. Both interact with Kaggle, but KaggleHub focuses on standardized download of and access to models and datasets, often in combination with other ML frameworks. Users who rely on `kaggle.api` for broader tasks (competitions, submissions, etc.) should expect a different purpose and a different API.
- gotcha Resource handles follow a specific format: `owner/model-name/framework/variation/version` for models and `owner/dataset-slug` for datasets. Incorrectly formatted handles will lead to download failures.
- gotcha Authentication requires Kaggle API credentials. This typically means setting `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables or having a valid `kaggle.json` file in `~/.kaggle/`.
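The two handle formats listed above can be checked before any download is attempted. The following is a minimal sketch, not part of KaggleHub: `classify_handle` is a hypothetical helper, and it assumes the version segment is a plain integer. The library itself may accept other handle forms; this only mirrors the two formats described in the warnings.

```python
def classify_handle(handle: str) -> str:
    """Return "model" or "dataset" for a well-formed handle, else raise ValueError.

    Hypothetical helper (not a kagglehub function). It only mirrors the two
    formats described above:
      owner/model-name/framework/variation/version  -> "model"
      owner/dataset-slug                            -> "dataset"
    """
    parts = handle.split("/")
    # Reject empty segments (e.g. "owner//name") and segments with stray whitespace.
    if any(not part or part != part.strip() for part in parts):
        raise ValueError(f"Empty or padded segment in handle: {handle!r}")
    if len(parts) == 2:
        return "dataset"
    if len(parts) == 5:
        # Assumption: the version segment is a non-negative integer.
        if not parts[4].isdigit():
            raise ValueError(f"Version must be an integer in handle: {handle!r}")
        return "model"
    raise ValueError(f"Expected 2 or 5 '/'-separated segments in handle: {handle!r}")


print(classify_handle("google/vit/tensorflow/vit-base-patch16-224-fe/2"))  # model
print(classify_handle("owner/dataset-slug"))  # dataset
```

Failing early with a `ValueError` gives a clearer message than a failed download caused by a malformed handle.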
Install
pip install kagglehub
Imports
- model_download
from kagglehub import model_download
- dataset_download
from kagglehub import dataset_download
- hf_model_download
from kagglehub import hf_model_download
Quickstart
from kagglehub import model_download
# Requires Kaggle API credentials: set the KAGGLE_USERNAME and KAGGLE_KEY
# environment variables, or place a kaggle.json file in ~/.kaggle/.
# Download a specific version of a model (the trailing "/2" is the version).
model_handle = 'google/vit/tensorflow/vit-base-patch16-224-fe/2'
# model_download returns the local path of the downloaded model files.
model_path = model_download(model_handle)
print(f"Downloaded model path: {model_path}")
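Because the quickstart depends on credentials being configured, a script may want to check for them before calling `model_download`. The sketch below is an illustration only: `kaggle_credentials_configured` is a hypothetical helper, not a KaggleHub function, and it checks only the two mechanisms named in this document (the `KAGGLE_USERNAME`/`KAGGLE_KEY` environment variables and `~/.kaggle/kaggle.json`). It does not verify that the credentials are valid.

```python
import os
from pathlib import Path


def kaggle_credentials_configured(env=None, config_dir=None) -> bool:
    """Report whether Kaggle credentials appear to be present.

    Hypothetical helper (not part of kagglehub). `env` and `config_dir`
    are parameters introduced here so the check can be exercised without
    touching the real environment; they default to os.environ and ~/.kaggle.
    """
    env = os.environ if env is None else env
    config_dir = Path.home() / ".kaggle" if config_dir is None else Path(config_dir)
    # Both variables must be set and non-empty to count as configured.
    has_env = bool(env.get("KAGGLE_USERNAME")) and bool(env.get("KAGGLE_KEY"))
    # Presence check only: the file contents are not parsed or validated.
    has_file = (config_dir / "kaggle.json").is_file()
    return has_env or has_file


if not kaggle_credentials_configured():
    print("Kaggle credentials not found: set KAGGLE_USERNAME and KAGGLE_KEY "
          "or create ~/.kaggle/kaggle.json before downloading.")
```

Running this check ahead of the download lets a script stop with a clear message instead of surfacing an authentication error partway through.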