Kaggle CLI
The `kaggle` Python package, also known as the Kaggle CLI, provides a command-line interface and a Python API for working with Kaggle resources: competitions, datasets, models, and notebooks. It supports listing, downloading, creating, updating, and deleting these resources programmatically. The current version is 2.0.1; new releases ship periodically with feature additions and bug fixes.
Warnings
- gotcha Authentication failures are the most common issue. Make sure your Kaggle API credentials file (`kaggle.json`) is in `~/.kaggle/`, or that the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables are both set. The library checks these locations automatically.
- gotcha Kaggle enforces dynamic rate limits on API calls. Excessive or rapid requests can lead to HTTP 429 ('Too Many Requests') errors. This is particularly relevant for automated scripts.
- gotcha The library's behavior, especially regarding authentication and resource caching, can differ when run inside a Kaggle Notebook environment compared to a local machine. For instance, `kagglehub` (a related library) is authenticated by default in Kaggle notebooks, but the `kaggle` CLI requires explicit local setup.
- breaking The `kaggle` library (Kaggle CLI) now requires Python 3.11 or newer. Python 3.10 and older are not supported and will fail at install time or at runtime.
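For the rate-limit warning, one common mitigation is to retry with exponential backoff when a call fails with HTTP 429. The sketch below is a generic helper, not part of the `kaggle` library: `call_with_backoff` and `get_status` are hypothetical names, and how the client surfaces a 429 (which exception type, which attribute holds the status code) is an assumption that may need adjusting for your installed version.

```python
import random
import time


def get_status(exc):
    """Best-effort extraction of an HTTP status code from an exception.

    Assumption: the exception exposes either `.status` or
    `.response.status_code`. Adjust this for the exception type
    your installed client actually raises.
    """
    status = getattr(exc, "status", None)
    if status is None:
        response = getattr(exc, "response", None)
        status = getattr(response, "status_code", None)
    return status


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on HTTP 429 only."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            # Anything other than a rate-limit error is re-raised immediately.
            if get_status(exc) != 429 or last_attempt:
                raise
            # Base delay doubles each attempt, plus up to 1s of random jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
```

With an authenticated `api` object as in the Quickstart, a call could be wrapped as `call_with_backoff(lambda: api.competitions_list(search='titanic'))`. The `sleep` parameter is injectable so the helper can be tested without real delays.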
Install
pip install kaggle
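Because the Warnings section lists Python 3.11 as the minimum, it can help to check the interpreter before installing. This is a minimal sketch; `meets_minimum` and `MIN_PYTHON` are illustrative names, not part of the `kaggle` package.

```python
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the Warnings section


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    if meets_minimum():
        print(f"Python {current} meets the 3.11+ requirement")
    else:
        print(f"Python {current} is too old; 3.11 or newer is required")
```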
Imports
- KaggleApi
from kaggle.api.kaggle_api_extended import KaggleApi
Quickstart
import os
from kaggle.api.kaggle_api_extended import KaggleApi
# --- Authentication ---
# Option 1 (Recommended for local dev): Place kaggle.json in ~/.kaggle/
# (Download from Kaggle profile settings: Account -> 'Create New API Token')
# Option 2: Set environment variables (e.g., in your shell or .env file)
# export KAGGLE_USERNAME='your_username'
# export KAGGLE_KEY='your_api_key'
# Initialize the API client
api = KaggleApi()
api.authenticate() # This will automatically load credentials
# --- Example: List competitions ---
print('Listing recent competitions:')
# sort_by accepts values such as 'recentlyCreated', 'latestDeadline', or 'prize'.
competitions = api.competitions_list(sort_by='recentlyCreated', page_size=5)
for comp in competitions:
    print(f"- {comp.title} (ID: {comp.id})")
# --- Example: Download a public dataset ---
# Replace 'dataset-owner/dataset-name' with the actual dataset reference.
# For example: 'lakshmi25npathi/sentiment-analysis-on-movie-reviews'
# Make sure you have permission to download the dataset (some require acceptance of rules).
# try:
#     print("\nDownloading dataset...")
#     api.dataset_download_files(
#         'lakshmi25npathi/sentiment-analysis-on-movie-reviews',
#         path='./data',
#         unzip=True
#     )
#     print("Dataset downloaded to ./data and unzipped.")
# except Exception as e:
#     print(f"Error downloading dataset: {e}")
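Since authentication failures are the most common issue, a small pre-flight check can report which of the two credential sources from the Quickstart comments is present before `api.authenticate()` is called. This is a sketch under stated assumptions: `find_credentials` is a hypothetical helper that inspects only the environment variables and `~/.kaggle/kaggle.json`, and it does not replicate the library's own lookup logic.

```python
import json
import os
from pathlib import Path


def find_credentials(env=None, config_dir=None):
    """Return 'env', 'file', or None depending on where credentials are found.

    Hypothetical helper: it only checks the two locations described in this
    document and does not validate the credentials against Kaggle.
    """
    env = os.environ if env is None else env
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return "env"

    config_dir = Path.home() / ".kaggle" if config_dir is None else Path(config_dir)
    token_path = config_dir / "kaggle.json"
    try:
        data = json.loads(token_path.read_text())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed token file.
        return None
    # kaggle.json holds a JSON object with "username" and "key" fields.
    if isinstance(data, dict) and data.get("username") and data.get("key"):
        return "file"
    return None
```

Calling `find_credentials()` with no arguments checks the real environment and home directory; a result of `None` suggests setting the variables or placing `kaggle.json` before authenticating.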