Kaldi-ark loading and writing module
kaldiio is a pure Python module for reading and writing Kaldi ark and scp files. It handles the common Kaldi object types, including binary and text matrices, vectors, and compressed matrices, and represents them as NumPy arrays. The library also supports I/O through Unix pipes and some extended ark formats, including those used for NumPy, Pickle, WAV, and FLAC files. It is actively maintained with frequent minor and patch releases.
Warnings
- gotcha `kaldiio` is easily confused with older or related libraries such as `kaldi_io` (by Karel Vesely) and `torchaudio.kaldi_io`. The `torchaudio.kaldi_io` module in particular has been deprecated since torchaudio 2.8 and will be removed in 2.9, so code that depends on it will break unless it switches to `kaldiio` directly.
- breaking The `wav` option, previously available in `ReadHelper` for WAV file processing, was removed in version 2.11.0. Code relying on this option will break.
- breaking Version 2.16.0 introduced support for an 'extended ark format'. Older versions of `kaldiio` (prior to 2.16.0) might not be able to correctly read or fully utilize `.ark` files created with these extended formats by newer `kaldiio` versions or other Kaldi tools.
- gotcha Kaldi's I/O often involves complex `rspecifiers` and `wspecifiers` which can include UNIX pipes (e.g., `ark:gunzip -c file.ark.gz |`). Incorrect formatting or environment issues with pipe commands are common pitfalls. `kaldiio` explicitly supports 'pipe fashion' for functions like `load_scp` (since v2.15.0) and `ReadHelper`.
Install
pip install kaldiio
Imports
- load_ark
from kaldiio import load_ark
- load_scp
from kaldiio import load_scp
- save_ark
from kaldiio import save_ark
- ReadHelper
from kaldiio import ReadHelper
- WriteHelper
from kaldiio import WriteHelper
- open_like_kaldi
from kaldiio import open_like_kaldi
Quickstart
import numpy as np
from kaldiio import load_scp, ReadHelper, WriteHelper, save_ark
# Example: Create dummy data for writing
utt_ids = ['utt1', 'utt2', 'utt3']
features = {
    'utt1': np.random.rand(100, 40).astype(np.float32),
    'utt2': np.random.rand(120, 40).astype(np.float32),
    'utt3': np.random.rand(90, 40).astype(np.float32),
}
# Write features to an ARK file and generate an SCP file
output_ark_path = 'output.ark'
output_scp_path = 'output.scp'
save_ark(output_ark_path, features, scp=output_scp_path)
print(f"Written to {output_ark_path} and {output_scp_path}")
# Read SCP file using ReadHelper (sequential access, supports pipes)
print("\nReading via ReadHelper (scp:)...")
with ReadHelper(f'scp:{output_scp_path}') as reader:
    for key, matrix in reader:
        print(f"Key: {key}, Matrix shape: {matrix.shape}")
# Read SCP file using load_scp (random access, returns dict-like object)
print("\nReading via load_scp (random access)...")
loaded_features = load_scp(output_scp_path)
print(f"Loaded 'utt2' shape: {loaded_features['utt2'].shape}")
# Cleanup (optional)
import os
os.remove(output_ark_path)
os.remove(output_scp_path)