Kaldi Native Fbank

1.22.3 · active · verified Sun Apr 12

Kaldi-native-fbank is a Python library providing a Kaldi-compatible online filter bank (fbank) feature extractor. It is designed to be efficient and has no external native dependencies, so it can be integrated across a range of architectures and operating systems. The library is actively maintained with frequent releases; the current stable version is 1.22.3.

Warnings

Install

Imports

Quickstart

This quickstart initializes the `OnlineFbank` extractor with `FbankOptions` and processes a waveform. Note that `kaldi_native_fbank.OnlineFbank.accept_waveform` expects input samples as a Python list or NumPy array, so a `torch.Tensor` should be converted before it is passed in. The example generates sample data with `torch.randn` for convenience and converts it to a list.

import kaldi_native_fbank as knf
import torch

# Configure Fbank options
opts = knf.FbankOptions()
opts.frame_opts.dither = 0.0
opts.mel_opts.num_bins = 80
opts.frame_opts.snip_edges = False
opts.mel_opts.debug_mel = False

sampling_rate = 16000
# Generate 10 seconds of random samples as a stand-in for real audio.
# torch.randn is used for convenience; accept_waveform needs a list or NumPy array.
samples_tensor = torch.randn(sampling_rate * 10)
samples = samples_tensor.tolist()

# Initialize the online Fbank extractor
fbank_extractor = knf.OnlineFbank(opts)

# Process the waveform
fbank_extractor.accept_waveform(sampling_rate, samples)

# Retrieve the number of frames available
num_frames = fbank_extractor.num_frames_ready
print(f"Number of frames ready: {num_frames}")

# Retrieve and print the first frame
if num_frames > 0:
    first_frame = fbank_extractor.get_frame(0)
    print(f"Shape of the first frame: {first_frame.shape}")
    print(f"First frame (first 5 values): {first_frame[:5].round(decimals=4)}")
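
The value reported by `num_frames_ready` depends on `snip_edges` and on whether the extractor has been told that the input is complete. The sketch below reproduces the frame-count rule used by Kaldi's feature windowing in plain Python, assuming kaldi-native-fbank follows the same rule and the usual Kaldi defaults of a 25 ms window and a 10 ms shift. The helper name `expected_num_frames` and its `flush` parameter are illustrative and are not part of the library.

```python
def expected_num_frames(num_samples, sampling_rate=16000,
                        frame_length_ms=25.0, frame_shift_ms=10.0,
                        snip_edges=False, flush=True):
    """Frame count under Kaldi's windowing rule (assumed to match the library)."""
    frame_length = int(sampling_rate * frame_length_ms / 1000)  # samples per window
    frame_shift = int(sampling_rate * frame_shift_ms / 1000)    # samples per hop

    if snip_edges:
        # Only windows that fit entirely inside the signal are counted.
        if num_samples < frame_length:
            return 0
        return 1 + (num_samples - frame_length) // frame_shift

    # Without edge snipping, frames are centered and the count is rounded.
    num_frames = (num_samples + frame_shift // 2) // frame_shift
    if flush:
        return num_frames

    # Input not yet marked as finished: drop trailing frames whose window
    # would extend past the samples received so far.
    first_of_last = (num_frames - 1) * frame_shift + frame_shift // 2 - frame_length // 2
    end_of_last = first_of_last + frame_length
    while num_frames > 0 and end_of_last > num_samples:
        num_frames -= 1
        end_of_last -= frame_shift
    return num_frames


ten_seconds = 16000 * 10
print(expected_num_frames(ten_seconds))                   # 1000
print(expected_num_frames(ten_seconds, flush=False))      # 999
print(expected_num_frames(ten_seconds, snip_edges=True))  # 998
```

Under these assumptions, the quickstart settings (`snip_edges = False`, 160000 samples) would give 999 frames while the stream is still open and 1000 once the end of input has been signalled.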

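Because `OnlineFbank` is an online extractor, audio can also be fed in small chunks instead of all at once. The following is a minimal sketch of that pattern. The helper `feed_in_chunks` is hypothetical and accepts any object exposing `accept_waveform`, `num_frames_ready`, and `get_frame` as used in the quickstart. The optional `input_finished()` call is an assumption about the Python binding and should be checked against the installed version.

```python
def feed_in_chunks(extractor, sampling_rate, samples, chunk_size, flush=False):
    """Feed `samples` chunk by chunk and return every frame that became ready."""
    frames = []

    def drain():
        # Read only the frames that were not available on the previous pass.
        while len(frames) < extractor.num_frames_ready:
            frames.append(extractor.get_frame(len(frames)))

    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]  # a list slice stays a list
        extractor.accept_waveform(sampling_rate, chunk)
        drain()

    if flush:
        # Assumption: the binding exposes input_finished() to signal end of
        # input so that the trailing frames are emitted.
        extractor.input_finished()
        drain()

    return frames
```

With the quickstart objects, `feed_in_chunks(fbank_extractor, sampling_rate, samples, 1600)` would feed the audio in 100 ms pieces and collect each frame as soon as it is ready.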