{"id":1838,"library":"h5netcdf","title":"h5netcdf: NetCDF4 via h5py","description":"h5netcdf is an open-source Python package that provides an interface for the netCDF4 file format, reading and writing local or remote HDF5 files directly via h5py or h5pyd. It aims to offer netCDF4 capabilities without relying on the Unidata netCDF C library. The current version is 1.8.1, and it maintains a regular release cadence, with recent patch and minor releases occurring every few months.","status":"active","version":"1.8.1","language":"en","source_language":"en","source_url":"https://github.com/h5netcdf/h5netcdf","tags":["netcdf","hdf5","h5py","data science","scientific computing","file I/O","xarray-backend"],"install":[{"cmd":"pip install h5netcdf","lang":"bash","label":"Basic install (requires h5py to be installed separately for versions >= 1.8.0)"},{"cmd":"pip install h5netcdf[h5py]","lang":"bash","label":"Install with h5py backend (recommended)"},{"cmd":"pip install h5netcdf[pyfive]","lang":"bash","label":"Install with pyfive pure-Python backend"},{"cmd":"pip install h5netcdf[h5pyd]","lang":"bash","label":"Install with h5pyd backend for HDF5 REST API"}],"dependencies":[{"reason":"Requires Python 3.9 or newer.","package":"python","optional":false},{"reason":"Primary backend for HDF5 I/O. Required for full functionality and commonly installed via the `h5netcdf[h5py]` extra.","package":"h5py","optional":false},{"reason":"Fundamental package for numerical computing, used for data arrays.","package":"numpy","optional":false},{"reason":"Optional pure-Python HDF5 reading backend, installed via the `h5netcdf[pyfive]` extra.","package":"pyfive","optional":true},{"reason":"Optional backend for the HDF5 REST API, installed via the `h5netcdf[h5pyd]` extra.","package":"h5pyd","optional":true}],"imports":[{"note":"Main class for interacting with NetCDF4 files in the new API.","symbol":"File","correct":"from h5netcdf import File"},{"note":"Entry point for the legacy API, designed for netCDF4-python users.","symbol":"Dataset","correct":"from h5netcdf.legacyapi import Dataset"}],"quickstart":{"code":"import h5netcdf\nimport numpy as np\nimport os\n\nfile_path = 'my_test_data.nc'\n\n# Write data using the new API\nwith h5netcdf.File(file_path, 'w') as f:\n    f.dimensions = {'x': 5, 'y': 3}\n    var = f.create_variable('temperature', ('x', 'y'), 'f4')\n    var[:] = np.random.rand(5, 3)\n    # The new API exposes attributes via a dict-like `attrs` interface\n    var.attrs['units'] = 'Kelvin'\n    f.create_group('forecast_data')\n\nprint(f\"Successfully wrote data to {file_path}\")\n\n# Read data\nwith h5netcdf.File(file_path, 'r') as f:\n    print(f\"Dimensions: {list(f.dimensions.keys())}\")\n    temp = f.variables['temperature']\n    print(f\"Variable 'temperature' shape: {temp.shape}\")\n    print(f\"Variable 'temperature' units: {temp.attrs['units']}\")\n    print(f\"First few values: {temp[:2, :2]}\")\n    if 'forecast_data' in f.groups:\n        print(\"Group 'forecast_data' exists.\")\n\n# Clean up\nos.remove(file_path)","lang":"python","description":"This quickstart demonstrates how to create a NetCDF4 file, define dimensions, create a variable, write data, add an attribute (via the new API's dict-like `attrs` interface), create a group, and then read the data back using the `h5netcdf.File` (new API) interface."},"warnings":[{"fix":"Pass `decode_vlen_strings=True` to the `h5netcdf.File()` constructor when opening files with variable-length strings, or handle byte arrays directly.","message":"With h5py version 3.0+, the default behavior for decoding variable-length strings changed from automatically decoding to UTF-8 strings to returning arrays of bytes. To restore the automatic decoding behavior that matches the legacy h5py API and netCDF4-python, explicitly set `decode_vlen_strings=True` in the `h5netcdf.File` constructor.","severity":"breaking","affected_versions":"h5netcdf versions using h5py 3.0+"},{"fix":"For new files, ensure `h5py >= 3.7.0` is installed, so that h5netcdf defaults to `track_order=True`. Files created with older versions may reopen with order tracking disabled; if compatibility with netCDF4-c append operations is critical, recreate them with `track_order=True`, or set the parameter explicitly when using older h5py versions.","message":"The `track_order` parameter's default behavior changed in h5netcdf 1.1.0 to `True` (if h5py >= 3.7.0 is detected) for *newly created* netCDF4 files. This ensures compatibility with netCDF4-c. However, files created with older versions of h5netcdf (e.g., 1.0.2 and older, except for 0.13.0), where `track_order=False` was effectively or explicitly set, will continue to open with order tracking disabled in newer h5netcdf versions, potentially leading to interoperability issues if external netCDF4-c tools expect ordered dimensions/variables.","severity":"breaking","affected_versions":"All versions, especially when migrating files created with h5netcdf < 1.1.0 or h5py < 3.7.0."},{"fix":"To allow writing these non-NetCDF4-compliant HDF5 features, pass `invalid_netcdf=True` to the `h5netcdf.File()` constructor. Be aware that such files may not be readable by other netCDF tools.","message":"By default, `h5netcdf` raises a `CompatibilityError` if you attempt to write HDF5 features (like certain data types or arbitrary filters) that are not considered valid NetCDF4 by other tools. While these features are valid HDF5, they break NetCDF compatibility. In versions prior to 0.7.3, this was merely a warning.","severity":"gotcha","affected_versions":"All versions, with stricter enforcement since ~0.7.3"},{"fix":"When opening the file, set `phony_dims='sort'` in `h5netcdf.File()` to instruct `h5netcdf` to invent phony dimensions, mimicking standard NetCDF behavior. Alternatively, `phony_dims='access'` defers phony dimension creation to access time, but with different naming conventions.","message":"If you access variables in an HDF5 file that have no dimension scale associated with one of their axes, `h5netcdf` will raise a `ValueError`. This often occurs with non-NetCDF HDF5 files.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Manually resize dimensions using `group.resize_dimension(dimension, size)` before writing data that would exceed the current dimension size.","message":"When using the new API, automatic resizing of unlimited dimensions with array indexing (e.g., `variable[i, :] = data`) is *not* available. This differs from the `netCDF4-python` library's behavior.","severity":"gotcha","affected_versions":"All versions (new API)"},{"fix":"Cache the `_h5ds` object or its relevant properties if they are accessed repeatedly in a performance-critical loop. For example, store `variable.shape` in a local variable if it is constant for the loop's duration.","message":"Repeated access to properties that rely on the underlying `_h5ds` HDF5 dataset object can be costly in terms of performance, as `_h5ds` is created on demand. This can impact workflows that frequently query properties like `variable.shape` in a loop.","severity":"gotcha","affected_versions":"All versions"},{"fix":"When wrapping an `h5py.File` object, explicitly close the original `h5py.File` object when it is no longer needed to release resources.","message":"If you initialize `h5netcdf.File` by passing an existing `h5py.File` object (e.g., `h5netcdf.File(h5py_file_obj)`), closing the `h5netcdf.File` wrapper will *not* close the underlying `h5py.File` object. However, if the file is opened by path (e.g., `h5netcdf.File('mydata.nc')`), closing the `h5netcdf.File` *will* close the underlying HDF5 file.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}