{"id":4597,"library":"kaldi-native-fbank","title":"Kaldi Native Fbank","description":"Kaldi-native-fbank is a Python library providing a Kaldi-compatible online filter bank (fbank) feature extractor. It is designed to be efficient and has no external native dependencies, aiming for seamless integration across various architectures and operating systems. The library is actively maintained with frequent releases, with the current stable version being 1.22.3.","status":"active","version":"1.22.3","language":"en","source_language":"en","source_url":"https://github.com/csukuangfj/kaldi-native-fbank","tags":["audio processing","speech recognition","feature extraction","fbank","kaldi"],"install":[{"cmd":"pip install kaldi-native-fbank","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Used in common examples and for convenient waveform generation/handling, but not a hard dependency for the core library.","package":"torch","optional":true},{"reason":"Often used for array manipulation and is compatible with `accept_waveform` which takes list or numpy array.","package":"numpy","optional":true}],"imports":[{"symbol":"kaldi_native_fbank","correct":"import kaldi_native_fbank as knf"}],"quickstart":{"code":"import kaldi_native_fbank as knf\nimport torch\nimport numpy as np\n\n# Configure Fbank options\nopts = knf.FbankOptions()\nopts.frame_opts.dither = 0.0\nopts.mel_opts.num_bins = 80\nopts.frame_opts.snip_edges = False\nopts.mel_opts.debug_mel = False\n\nsampling_rate = 16000\n# Generate 10 seconds of random audio samples (simulating real audio)\n# Using torch.randn for convenience, convert to list or numpy array for `accept_waveform`\nsamples_tensor = torch.randn(sampling_rate * 10)\nsamples = samples_tensor.tolist() # kaldi_native_fbank expects list or numpy array\n\n# Initialize the online Fbank extractor\nfbank_extractor = knf.OnlineFbank(opts)\n\n# Process the waveform\nfbank_extractor.accept_waveform(sampling_rate, samples)\n\n# Retrieve the 
number of frames available\nnum_frames = fbank_extractor.num_frames_ready\nprint(f\"Number of frames ready: {num_frames}\")\n\n# Retrieve and print the first frame\nif num_frames > 0:\n    first_frame = fbank_extractor.get_frame(0)\n    print(f\"Shape of the first frame: {first_frame.shape}\")\n    print(f\"First frame (first 5 values): {first_frame[:5].round(decimals=4)}\")","lang":"python","description":"This quickstart demonstrates how to initialize the `OnlineFbank` extractor with `FbankOptions` and process a waveform. Note that `kaldi_native_fbank.OnlineFbank.accept_waveform` expects input samples as a Python list or NumPy array, unlike some other libraries that might accept `torch.Tensor` directly. The example uses `torch.randn` for convenience to generate sample data, which is then converted to a list."},"warnings":[{"fix":"Consult the Kaldi documentation or `kaldi-native-fbank` source for feature computation details. Apply appropriate normalization or transformation if integrating with different feature pipelines.","message":"Kaldi's Fbank features are typically in log space and might have different scaling or representation compared to filter bank features generated by other Python speech processing libraries (e.g., `python_speech_features`). Always verify the feature specifications if integrating with models trained on features from other sources.","severity":"gotcha","affected_versions":"All versions"},{"fix":"If your audio is held in a `torch.Tensor`, convert it to a `list` or `numpy.ndarray` (e.g., `your_audio_tensor.tolist()` or `your_audio_tensor.numpy()`) before calling `accept_waveform`. `torch` itself does not need to be installed to use the library.","message":"While `kaldi-native-fbank` itself is advertised as 'without external dependencies' (referring to native libraries), its Python usage examples, including the official ones, often utilize `torch` for waveform generation and comparison. 
If you're not using `torch`, ensure your audio data is converted to a standard Python list or NumPy array before passing it to methods like `accept_waveform`.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}