RosettaSciIO
RosettaSciIO is a Python library for reading and writing various scientific file formats, designed with a focus on ease of use and integration with scientific data analysis tools like HyperSpy. It supports a wide range of formats including HDF5, TIFF, DigitalMicrograph (DM3/DM4), and more, often leveraging optional backend libraries. The current version is 0.13.0, and new minor releases occur every few months, introducing new features and format support.
Common errors
- ModuleNotFoundError: No module named 'dask'
  cause A feature that relies on `dask` (e.g., lazy loading) was used in an environment where `dask` is not installed.
  fix Install the missing package with `pip install dask`, or reinstall with every optional dependency using `pip install "rosettasciio[all]"`.
- ValueError: No reader found for file_format: dm3
  cause Trying to read a file format (e.g., `.dm3`, advanced `.tif`) for which the required backend library is not installed, or which no available reader supports.
  fix Check whether `rosettasciio` supports the format. If it does, install the optional dependencies that reader needs; `pip install "rosettasciio[all]"` pulls in every optional backend.
- PermissionError: [Errno 13] Permission denied: 'my_output_file.hdf5'
  cause The user account running the script lacks write permission for the output directory when saving a file, or read permission for the file when opening it.
  fix Write to a directory where the user has write permission, or adjust the file system permissions on the target path or file.
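The permission failure above can be caught before any expensive processing starts. The following is a minimal sketch using only the Python standard library; the helper name `can_write_to` is hypothetical and is not part of `rosettasciio`.

```python
import os
import tempfile


def can_write_to(path):
    """Return True if the directory that would hold `path` exists and is writable."""
    directory = os.path.dirname(os.path.abspath(path))
    return os.path.isdir(directory) and os.access(directory, os.W_OK)


# A file inside a fresh temporary directory should be writable, while a file
# inside a directory that does not exist is not.
with tempfile.TemporaryDirectory() as tmp:
    writable = can_write_to(os.path.join(tmp, "my_output_file.hdf5"))
    missing = can_write_to(os.path.join(tmp, "no_such_dir", "my_output_file.hdf5"))

print(writable, missing)
```

`os.access` is only advisory: the write can still fail afterwards (for example on a read-only network share), so the save call itself should remain wrapped in error handling.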
Warnings
- gotcha Support for some file formats and features (e.g., HDF5-based formats, advanced TIFF, HyperSpy integration) depends on *optional* dependencies that are not installed by default. Use `pip install "rosettasciio[all]"` to install all of them.
- breaking `rosettasciio` requires Python 3.10 or newer. Attempting to use older Python versions will lead to installation failures or runtime errors due to syntax or dependency incompatibilities.
- gotcha Efficient handling of very large files relies on lazy loading with `dask`, often combined with `hyperspy` for the analysis itself. Without lazy loading, a large file is read fully into memory, which can exhaust the available RAM.
- gotcha On its own, `rosettasciio` returns plain array data (a `numpy.ndarray` by default). Working with `hyperspy` signals for advanced analysis requires `hyperspy` to be installed separately.
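To make the memory warning concrete, the in-memory size of a dataset can be estimated from its shape and element size before deciding whether to load it lazily. This is a rough sketch, not library functionality: the helper names and the 2 GiB budget are assumptions chosen for illustration.

```python
import math


def estimated_nbytes(shape, itemsize):
    """Bytes needed to hold an array of `shape` with `itemsize` bytes per element."""
    return math.prod(shape) * itemsize


def should_load_lazily(shape, itemsize, budget_bytes=2 * 1024**3):
    """Return True if the full array would exceed the (assumed) memory budget."""
    return estimated_nbytes(shape, itemsize) > budget_bytes


# A 10 x 10 uint8 image is tiny; a 512 x 512 x 2048 x 2 float32 stack is 4 GiB.
small = should_load_lazily((10, 10), itemsize=1)
large = should_load_lazily((512, 512, 2048, 2), itemsize=4)
print(small, large)
```

The budget should be set well below the machine's physical memory, since intermediate copies made during analysis can multiply the footprint.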
Install
- pip install rosettasciio
- pip install "rosettasciio[all]"
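After installing, it can help to confirm which optional backends are actually importable in the current environment. The sketch below uses only `importlib` from the standard library; the helper name `missing_modules` is hypothetical, and the module names in the list are illustrative assumptions rather than an authoritative list of what `rosettasciio` needs.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Illustrative backend names (assumed); adjust to the formats you need.
backends = ["dask", "h5py", "tifffile"]
not_installed = missing_modules(backends)

if not_installed:
    print("Missing optional backends:", ", ".join(not_installed))
else:
    print("All listed backends are importable.")
```

`find_spec` locates a top-level module without importing it, so the check is cheap and does not trigger slow imports.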
Imports
- file_reader
from rsciio.image import file_reader
- file_writer
from rsciio.image import file_writer
Quickstart
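import os
import numpy as np
# Each format plugin exposes its own file_reader and file_writer. The image
# plugin used here relies on the optional imageio backend.
from rsciio.image import file_reader, file_writer

# Create a dummy NumPy array (e.g., a simple image)
data_to_save = np.arange(100, dtype=np.uint8).reshape(10, 10)
filename_png = "my_dummy_data.png"

# Writers take a dictionary describing the signal: the data plus one entry per axis.
signal = {
    "data": data_to_save,
    "axes": [
        {"name": name, "units": "px", "size": size, "scale": 1.0,
         "offset": 0.0, "index_in_array": i, "navigate": False}
        for i, (name, size) in enumerate(zip(("y", "x"), data_to_save.shape))
    ],
    "metadata": {},
    "original_metadata": {},
}
print(f"Saving data to {filename_png} using file_writer...")
file_writer(filename_png, signal)
print(f"Successfully saved data to {filename_png}")

print(f"Reading data from {filename_png} using file_reader...")
# Readers return a list of dictionaries; the array is stored under "data".
read_data = file_reader(filename_png)[0]["data"]
print(f"Successfully read data from {filename_png}. Shape: {read_data.shape}")
print(f"Data type: {read_data.dtype}")

# Clean up the dummy file
os.remove(filename_png)
print(f"Cleaned up {filename_png}")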