{"id":640,"library":"h5py","title":"h5py","description":"The h5py package provides a Pythonic interface to the HDF5 binary data format, allowing users to store and manipulate large amounts of numerical data efficiently, often integrating seamlessly with NumPy arrays. It offers both high-level and low-level access to HDF5 files, datasets, and groups. The current version is 3.16.0, with development actively maintained through frequent releases.","status":"active","version":"3.16.0","language":"python","source_language":"en","source_url":"https://github.com/h5py/h5py","tags":["hdf5","data-storage","scientific-computing","numpy","io","binary-data"],"install":[{"cmd":"pip install h5py","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"h5py is designed to work closely with NumPy arrays for data manipulation and storage.","package":"numpy","optional":false},{"reason":"h5py is a Python wrapper for the HDF5 C library. While pre-built wheels often bundle this, custom installations or specific features (like Parallel HDF5) require a separately installed HDF5 C library with development headers.","package":"hdf5 (C library)","optional":false},{"reason":"Required for enabling Parallel HDF5 features, which allow parallel writing to HDF5 files across multiple processes. 
Note that h5py and the HDF5 C library must also be compiled with MPI support.","package":"mpi4py","optional":true}],"imports":[{"note":"The `File` object is the primary entry point for interacting with HDF5 files.","symbol":"File","correct":"import h5py\n# ...\nh5py.File(...)"}],"quickstart":{"code":"import h5py\nimport numpy as np\nimport os\n\nfile_path = 'my_data.h5'\n\n# Create a new HDF5 file (mode 'w' will overwrite if exists)\nwith h5py.File(file_path, 'w') as f:\n    # Create a group (like a directory)\n    group = f.create_group('my_group')\n    \n    # Create a dataset within the group (like a NumPy array)\n    data = np.arange(100).reshape(10, 10)\n    dset = group.create_dataset('dataset_1', data=data)\n    \n    # Add attributes to the dataset (metadata)\n    dset.attrs['units'] = 'arbitrary'\n    dset.attrs['description'] = 'Sample 2D integer array'\n    \n    # You can also create datasets directly at the root level\n    f.create_dataset('another_dataset', data=np.random.rand(5))\n\nprint(f\"File '{file_path}' created successfully.\")\n\n# Read data from the HDF5 file\nwith h5py.File(file_path, 'r') as f:\n    # List all top-level objects\n    print(f\"\\nKeys in file: {list(f.keys())}\")\n    \n    # Access a group\n    group_read = f['my_group']\n    print(f\"Keys in 'my_group': {list(group_read.keys())}\")\n    \n    # Access a dataset\n    dset_read = group_read['dataset_1']\n    \n    # Read data into memory (using array-style slicing for the whole dataset)\n    read_data = dset_read[()]\n    print(f\"\\nShape of read_data: {read_data.shape}\")\n    print(f\"First 5 elements of read_data: {read_data.flatten()[:5]}\")\n    \n    # Access attributes\n    print(f\"Units attribute: {dset_read.attrs['units']}\")\n    \n    # Read a slice of the data\n    slice_data = dset_read[0:5, 0:5]\n    print(f\"Slice (0:5, 0:5) of dataset_1:\\n{slice_data}\")\n\n# Clean up the created file\nos.remove(file_path)","lang":"python","description":"This quickstart 
demonstrates how to create an HDF5 file, add groups and datasets, store NumPy arrays, attach metadata as attributes, and then read the data and attributes back. It emphasizes using context managers (`with h5py.File(...)`) for proper file handling."},"warnings":[{"fix":"Always specify the file mode explicitly: `h5py.File('file.h5', 'w')` to create or truncate a file, `h5py.File('file.h5', 'a')` to read/write and create the file if it is missing, or `h5py.File('file.h5', 'r+')` to read/write an existing file.","message":"The default mode for `h5py.File` changed from read/write ('a') to read-only ('r') in h5py 3.0. Writing to a file opened without an explicit write-enabled mode ('w', 'a', or 'r+') raises an error.","severity":"breaking","affected_versions":">=3.0.0"},{"fix":"Upgrade your Python environment to 3.10 or newer to use current h5py versions.","message":"h5py 3.0 dropped support for Python 2.7 and requires Python 3.6 or above. Later releases raised the minimum further: h5py 3.12 requires Python 3.9 or newer, and h5py 3.15 requires Python 3.10 or newer.","severity":"breaking","affected_versions":">=3.0.0"},{"fix":"Use NumPy-style slicing to read the entire dataset: `mydataset[()]` or `mydataset[...]`.","message":"The `Dataset.value` property, which dumped the entire dataset into a NumPy array, was deprecated in h5py 2.1 and removed in h5py 3.0. Accessing it in h5py 3.0 or later raises `AttributeError`.","severity":"deprecated","affected_versions":">=2.0.0"},{"fix":"Always use the `with h5py.File(...) as f:` context manager, which ensures the file is closed even if errors occur.","message":"HDF5 files must be explicitly closed to ensure data integrity, especially after writing. 
Failing to do so can lead to corrupted files or unreleased file handles.","severity":"gotcha","affected_versions":"all"},{"fix":"Specify the desired `dtype` explicitly when creating a dataset from a shape, e.g., `group.create_dataset('name', shape=(...), dtype='f8')` for double precision. When passing `data`, check the array's dtype or set it explicitly, e.g., `group.create_dataset('name', data=my_array, dtype=np.float64)`.","message":"When `group.create_dataset()` is called with a `shape` but neither `data` nor `dtype`, the dataset defaults to single precision (`'f'`, i.e. `numpy.float32`), unlike NumPy's default of `numpy.float64`. When `data` is supplied, the dtype is taken from the data instead. Relying on the default can silently lose precision when double-precision values are written to the dataset later.","severity":"gotcha","affected_versions":"all"},{"fix":"For parallel I/O, consider using multiprocessing, with each process opening and closing the file itself, or compile h5py and HDF5 with MPI support and use `mpi4py` for true Parallel HDF5 (for writing). Parallel *read* access is generally safe from separate processes.","message":"Using h5py with multiple threads (Python's `threading` module) will not provide parallel performance for HDF5 operations. The underlying `libhdf5` C library is generally not thread-safe, and h5py uses a global Python lock to serialize access to the HDF5 C API, preventing simultaneous calls.","severity":"gotcha","affected_versions":"all"}],"env_vars":null,"last_verified":"2026-05-12T17:06:37.533Z","next_check":"2026-06-26T00:00:00.000Z","problems":[{"fix":"Install h5py using pip: `pip install h5py` or using conda: `conda install h5py`.","cause":"The h5py library is not installed in the Python environment being used, or the environment where it's installed is not activated.","error":"ModuleNotFoundError: No module named 'h5py'"},{"fix":"Verify the file's integrity and ensure it is a legitimate HDF5 file. 
Try re-downloading the file if it came from an external source.","cause":"This error indicates that the file you are trying to open is either corrupted, not a valid HDF5 file, or was improperly downloaded.","error":"OSError: Unable to open file (File signature not found)"},{"fix":"Check the exact path and name of the dataset or group. Use `file.keys()` or `list(file.keys())` to see top-level objects, and `group.keys()` to inspect contents of groups, to ensure the object name is correct.","cause":"This error occurs when you try to access a dataset or group within an HDF5 file that does not exist at the specified path.","error":"KeyError: \"Unable to open object (object 'data' doesn't exist)\""},{"fix":"Convert your data to a supported NumPy numerical dtype (e.g., `np.float32`, `np.int64`) or a fixed-length or variable-length string dtype using `h5py.string_dtype()` before writing. If storing complex Python objects, you may need to serialize them (e.g., using pickle) and store them as byte strings.","cause":"h5py cannot directly store generic Python objects or NumPy arrays with `dtype=object` (which can hold mixed types like lists, dictionaries, or arbitrary Python objects) in an HDF5 dataset, as HDF5 is fundamentally designed for homogeneous numerical data.","error":"ValueError: Object dtype dtype('O') has no native HDF5 equivalent"},{"fix":"Ensure you are accessing an actual `h5py.Dataset` object. Iterate through the group's contents and use `isinstance()` to distinguish between `h5py.Group` and `h5py.Dataset` objects before attempting to read data or access dataset-specific attributes. 
To get the data from a dataset, use `dataset[...]` or `dataset[()]`.","cause":"You are attempting to access a dataset-specific attribute (like `dtype`, `shape`, or `value`/`[:]`) on an `h5py.Group` object, which is a container for other HDF5 objects (datasets or other groups), not a dataset itself.","error":"AttributeError: 'Group' object has no attribute 'dtype'"}],"ecosystem":"pypi","meta_description":null,"install_score":97,"install_tag":"verified","quickstart_score":77,"quickstart_tag":"verified","pypi_latest":"3.16.0","install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.33,"mem_mb":9.1,"disk_size":"105.2M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.26,"mem_mb":9.1,"disk_size":"105.2M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":3.9,"import_time_s":0.24,"mem_mb":9.1,"disk_size":"102M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.21,"mem_mb":9.1,"disk_size":"102M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.45,"mem_mb":9.9,"disk_size":"113.0M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine 
(musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.42,"mem_mb":9.9,"disk_size":"113.0M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":3.7,"import_time_s":0.42,"mem_mb":9.9,"disk_size":"109M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.36,"mem_mb":9.9,"disk_size":"109M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.4,"mem_mb":9.8,"disk_size":"102.8M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.34,"mem_mb":9.8,"disk_size":"102.8M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":3.6,"import_time_s":0.4,"mem_mb":9.8,"disk_size":"98M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.35,"mem_mb":9.8,"disk_size":"98M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.32,"mem_mb":10.1,"disk_size":"102.3M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine 
(musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.32,"mem_mb":10.1,"disk_size":"102.2M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":3.7,"import_time_s":0.37,"mem_mb":10.1,"disk_size":"98M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.35,"mem_mb":10.1,"disk_size":"98M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":4.6,"import_time_s":0.3,"mem_mb":8.8,"disk_size":"110M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.22,"mem_mb":8.8,"disk_size":"110M"}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":"verified","tag_description":"quickstart runs on critical runtimes, recently 
tested","results":[{"runtime":"python:3.10-alpine","exit_code":0},{"runtime":"python:3.10-slim","exit_code":0},{"runtime":"python:3.11-alpine","exit_code":0},{"runtime":"python:3.11-slim","exit_code":0},{"runtime":"python:3.12-alpine","exit_code":0},{"runtime":"python:3.12-slim","exit_code":0},{"runtime":"python:3.13-alpine","exit_code":0},{"runtime":"python:3.13-slim","exit_code":0},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":0}]}}