TensorFlow Datasets (Nightly)

4.9.9.dev202510250044 · active · verified Thu Apr 16

TensorFlow Datasets (`tensorflow/datasets`) is a library of ready-to-use datasets for machine learning pipelines. It targets TensorFlow first, but also works with other frameworks, including JAX and PyTorch. The `tfds-nightly` package is the daily release of the library, so it often carries features and bug fixes before they reach the stable `tensorflow-datasets` release.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart assumes `tfds-nightly` is already installed. It imports `tensorflow_datasets`, sets a data directory, loads the MNIST training split together with its metadata, prints the dataset description and number of training examples, and then reads one example.

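import os

# Set the TFDS data directory (optional, but good practice for caching).
# Doing this before importing tensorflow_datasets ensures the setting is
# seen no matter when the library reads the variable.
os.environ['TFDS_DATA_DIR'] = '/tmp/tfds_data'

import tensorflow_datasets as tfds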

# Load a dataset (e.g., MNIST)
ds, info = tfds.load(
    'mnist',
    split='train',
    shuffle_files=True,
    as_supervised=True, # Returns (image, label) tuples
    with_info=True
)

print(f"Dataset info: {info.description}")
print(f"Number of training examples: {info.splits['train'].num_examples}")

# Read one example; increase the argument to take() to see more
for image, label in ds.take(1):
    print(f"Image shape: {image.shape}, Label: {label.numpy()}")
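The data directory step above can also be made explicit by passing `data_dir=` to `tfds.load` instead of relying only on the environment variable. The sketch below shows one possible way to do that. `resolve_data_dir` and `load_mnist` are hypothetical helpers written for this example, not part of TFDS, and the precedence they use (explicit argument, then `TFDS_DATA_DIR`, then `~/tensorflow_datasets`) is an assumption chosen for illustration.

```python
import os


def resolve_data_dir(explicit=None):
    """Hypothetical helper: pick the directory to hand to tfds.load.

    Assumed precedence: explicit argument, then the TFDS_DATA_DIR
    environment variable, then ~/tensorflow_datasets.
    """
    if explicit:
        return os.path.expanduser(explicit)
    from_env = os.environ.get("TFDS_DATA_DIR")
    if from_env:
        return from_env
    return os.path.join(os.path.expanduser("~"), "tensorflow_datasets")


def load_mnist(data_dir=None):
    """Hypothetical wrapper: load the MNIST training split from a chosen directory."""
    # Imported here so resolve_data_dir can be used without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(
        "mnist",
        split="train",
        as_supervised=True,  # yields (image, label) tuples
        data_dir=resolve_data_dir(data_dir),
    )
```

Calling `load_mnist('/tmp/tfds_data')` would then download and cache MNIST under that directory on first use, and reuse the cached copy afterwards.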
