{"id":8712,"library":"tfds-nightly","title":"TensorFlow Datasets (Nightly)","description":"TensorFlow Datasets (TFDS) is a library of datasets ready to use with TensorFlow. The `tfds-nightly` package is released daily and carries the latest features and bug fixes, often before they reach the stable `tensorflow-datasets` release. The library offers a large collection of datasets for machine learning pipelines and can also be used with other frameworks, including JAX and PyTorch.","status":"active","version":"4.9.9.dev202510250044","language":"en","source_language":"en","source_url":"https://github.com/tensorflow/datasets","tags":["tensorflow","datasets","machine-learning","nightly","data-preparation"],"install":[{"cmd":"pip install tfds-nightly","lang":"bash","label":"Latest nightly release"},{"cmd":"pip install tfds-nightly tensorflow matplotlib","lang":"bash","label":"Including common ML dependencies"}],"dependencies":[{"reason":"While TFDS can be used framework-agnostically and in a 'TensorFlow-less' manner for reading, full functionality (especially for building datasets or using `tf.data.Dataset` objects) often relies on TensorFlow.","package":"tensorflow","optional":true},{"reason":"Required for generating large datasets in a distributed manner, particularly when using the `tfds build` CLI for Beam-based datasets.","package":"apache-beam","optional":true},{"reason":"Needed for the 'TensorFlow-less' data loading path, providing efficient random access to dataset records.","package":"array_record","optional":true}],"imports":[{"symbol":"tfds","correct":"import tensorflow_datasets as tfds"}],"quickstart":{"code":"import os\n\n# Set the TFDS data directory (optional, but good practice for caching).\n# Doing this before the import keeps it effective no matter when TFDS reads it.\nos.environ['TFDS_DATA_DIR'] = '/tmp/tfds_data'\n\nimport tensorflow_datasets as tfds\n\n# Load a dataset (e.g., MNIST)\nds, info = tfds.load(\n    'mnist',\n    split='train',\n    shuffle_files=True,\n    as_supervised=True,  # Returns (image, label) tuples\n    
with_info=True\n)\n\nprint(f\"Dataset info: {info.description}\")\nprint(f\"Number of training examples: {info.splits['train'].num_examples}\")\n\n# Inspect the first example\nfor image, label in ds.take(1):\n    print(f\"Image shape: {image.shape}, Label: {label}\")","lang":"python","description":"This quickstart imports `tensorflow_datasets`, configures a data directory, loads the MNIST training split together with its metadata, and prints the shape and label of the first example."},"warnings":[{"fix":"Regularly check the GitHub changelog and be prepared to update code to match API changes. Pin `tfds-nightly` to a specific nightly version if stability is crucial, keeping in mind that a pinned build no longer tracks the latest changes.","message":"Nightly builds often include API changes and experimental features that may be unstable or subject to further modification before a stable release. Code written against a `tfds-nightly` version might break with subsequent nightly updates.","severity":"breaking","affected_versions":"All nightly versions"},{"fix":"Explicitly handle `None` values during dataset creation or consumption, e.g., by filtering, mapping, or using `tfds.features.Optional` if applicable, rather than relying on implicit defaults.","message":"The handling of `None` values when processing Hugging Face datasets (e.g., via `HuggingfaceDatasetBuilder`) changed from defaulting to `0`/`0.0` for int/float features to using NumPy's `-inf`. This can silently alter data or cause downstream errors if your code expected the old default behavior for missing values.","severity":"breaking","affected_versions":"v4.9.3 and later nightly builds"},{"fix":"Manually install `apache-beam` if you plan to use `tfds build` or other Beam-dependent functionalities: `pip install apache-beam`. 
Check the `tfds` changelog or documentation for any specific version requirements for `apache-beam`.","message":"Using `tfds build` for Beam-based datasets requires `apache-beam` to be installed, but `tfds-nightly` does not always automatically install it as a direct dependency. There have also been specific `apache-beam` version compatibility pins (`<2.65.0` in v4.9.9) that might cause issues.","severity":"gotcha","affected_versions":"v4.9.9 and potentially other versions when using Beam"},{"fix":"Ensure you are on the latest `tfds-nightly` version. If encountering `ModuleNotFoundError` for platform-specific modules on non-Linux OS, check relevant GitHub issues for patches or workarounds.","message":"There was a bug where the `resource` module, which is not available on Windows, caused a `ModuleNotFoundError` when importing `tensorflow_datasets`. While fixed in later versions, similar platform-specific dependency issues can arise in nightly builds.","severity":"gotcha","affected_versions":"Around v4.9.2-v4.9.3 (fixed in subsequent patches)"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install the package: `pip install tfds-nightly` (or `pip install tensorflow-datasets` for stable). If using a virtual environment, ensure it is activated.","cause":"The `tensorflow-datasets` or `tfds-nightly` package is not installed in the current Python environment, or the environment is not correctly activated.","error":"ModuleNotFoundError: No module named 'tensorflow_datasets'"},{"fix":"Inspect your dataset and data processing pipeline for `None` values. For Hugging Face datasets, be aware of the v4.9.3 change in `None` handling. 
Filter out `None`s, provide explicit default values, or use `tfds.features.Optional` where appropriate.","cause":"This error can occur when TensorFlow operations encounter `None` values in tensors where they are not expected, particularly after changes in `tfds`'s handling of missing data or if input data contains unexpected `None`s.","error":"ValueError: None values not supported."},{"fix":"Delete the corrupted file from the `downloads` folder and try again. If the upstream data has genuinely changed, the dataset builder needs to be updated. For custom datasets, use `tfds build --register_checksums` to update the checksum.","cause":"The downloaded file (or a file on the local disk) does not match the expected checksum, indicating a potential corruption, an update to the source data, or a local file system issue.","error":"NonMatchingChecksumError: Checksum mismatch for downloaded file..."},{"fix":"Ensure your environment is clean and all dependencies are up-to-date (`pip install --upgrade tfds-nightly`). If the problem persists, consult the TensorFlow Datasets GitHub issues for specific workarounds or related bugs.","cause":"This is an error that can occur when `tfds build` is run, potentially related to file system access or an internal path resolution issue within the library.","error":"TypeError: Unknown resource path: : MultiplexedPath"}]}