netCDF4 Python Interface
The netcdf4-python library provides an object-oriented Python interface to the netCDF C library. It reads and writes NetCDF-4 files as well as the older NetCDF-3 formats. NetCDF-4 files are stored in HDF5, which provides advanced features such as hierarchical groups, zlib compression, and new data types. Currently at version 1.7.4, the library is actively maintained, with several minor and patch releases each year.
Warnings
- breaking Starting with version 1.7.4, when running on free-threaded Python (e.g., a Python 3.13+ free-threaded build such as `python3.13t`), `netcdf4-python` may segfault if `netCDF4` functions are called from multiple threads concurrently. The underlying `netcdf-c` library is not thread-safe, and although `netcdf4-python` has internal locking, it is safest to serialize access to shared datasets in your own code.
- gotcha Installation via `pip` usually provides pre-compiled binary wheels that bundle the necessary `netCDF C` and `HDF5 C` libraries. However, if building from source, or on less common systems, these external C libraries must be installed separately and configured correctly (e.g., via `nc-config` or environment variables) for `netcdf4-python` to compile and function.
- breaking Prior to version 1.4.0, `netcdf4-python` would only return masked arrays if a slice of data explicitly contained missing values. From version 1.4.0 onwards, the default behavior changed to always return masked arrays for primitive and enum data types if `missing_value` or `_FillValue` attributes are defined, regardless of whether the slice contains actual missing data.
- deprecated Many examples and older codebases use `datetime.datetime.utcnow()` when assigning time attributes (e.g., to `nc.history`). `datetime.utcnow()` is deprecated as of Python 3.12, emits a `DeprecationWarning`, and is scheduled for removal in a future version. Use the timezone-aware `datetime.datetime.now(datetime.timezone.utc)` instead.
- breaking Versions 1.7.0 and 1.7.1 introduced a regression that prevented opening remote OPeNDAP files, resulting in `curl` errors. This issue was likely resolved in subsequent patch releases.
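One way to act on the thread-safety warning above is to route every netCDF4 call through a single application-level lock. The following is a minimal sketch under stated assumptions, not the library's own mechanism: `NC_LOCK`, `locked`, and `read_value` are hypothetical names, and `read_value` is a stand-in for a real call such as `nc.variables['temperature'][index]` so that the example runs without a file.

```python
import functools
import threading
from concurrent.futures import ThreadPoolExecutor

# One process-wide lock shared by every thread that touches netCDF4 objects.
# (Hypothetical name; netcdf4-python does not export this.)
NC_LOCK = threading.Lock()


def locked(func):
    """Run func while holding NC_LOCK so calls never overlap."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with NC_LOCK:
            return func(*args, **kwargs)
    return wrapper


@locked
def read_value(index):
    # Placeholder for a netCDF4 read such as nc.variables['temperature'][index].
    return index * 2


# Worker threads may be scheduled concurrently, but each call is serialized.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(read_value, range(8)))

print(results)
```

The lock trades parallel I/O for safety. If throughput matters, a common alternative is to give each worker process (not thread) its own `Dataset` handle.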
Install
- pip install netCDF4
- conda install -c conda-forge netcdf4
Imports
- Dataset
from netCDF4 import Dataset
Quickstart
import os
from netCDF4 import Dataset
import numpy as np

# Define a dummy NetCDF file path
filename = 'example.nc'

# Create a new NetCDF file in write mode ('w')
with Dataset(filename, 'w', format='NETCDF4') as nc_file:
    # Create dimensions
    nc_file.createDimension('x', 10)
    nc_file.createDimension('y', 5)
    nc_file.createDimension('time', None)  # 'None' for unlimited dimension

    # Create variables
    x_var = nc_file.createVariable('x', 'i4', ('x',))
    y_var = nc_file.createVariable('y', 'i4', ('y',))
    time_var = nc_file.createVariable('time', 'f8', ('time',))
    data_var = nc_file.createVariable('temperature', 'f4', ('time', 'y', 'x'))

    # Add attributes to variables (units match the Kelvin values written below)
    data_var.units = 'K'
    data_var.long_name = 'Air Temperature'

    # Write data to variables
    x_var[:] = np.arange(10)
    y_var[:] = np.arange(5)

    # Write data for the first time step (random values in Kelvin)
    time_var[0] = 0.0
    data_var[0, :, :] = np.random.rand(5, 10) * 30 + 273.15

    # Write data for a second time step (demonstrates unlimited dimension)
    time_var[1] = 1.0
    data_var[1, :, :] = np.random.rand(5, 10) * 30 + 273.15

print(f"Successfully created and wrote to {filename}")

# Read data from the NetCDF file in read mode ('r')
with Dataset(filename, 'r') as nc_file:
    print(f"\nOpened {filename} for reading:")
    print(f"File format: {nc_file.data_model}")
    print(f"Dimensions: {list(nc_file.dimensions.keys())}")
    print(f"Variables: {list(nc_file.variables.keys())}")
    temp_data = nc_file.variables['temperature'][:, :, :]
    print(f"Shape of 'temperature' data: {temp_data.shape}")
    print(f"Units of 'temperature': {nc_file.variables['temperature'].units}")
    print(f"Sample temperature data:\n{temp_data[0, 0, 0]:.2f} {nc_file.variables['temperature'].units}")

# Clean up the created file
os.remove(filename)
print(f"\nCleaned up {filename}")