Faster Whisper

1.2.1 · active · verified Thu Apr 09

Faster Whisper is a reimplementation of OpenAI's Whisper model using CTranslate2, a fast inference engine for Transformer models, which allows for faster inference and reduced memory usage. It is optimized for both CPU and GPU and supports several compute types (e.g., int8, float16, float32). The current version is 1.2.1, with an active release cadence that regularly adds new features, model support, and performance improvements.

Warnings

Install
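The package is published on PyPI under the name `faster-whisper`; a typical install:

```shell
pip install faster-whisper
```

Note that GPU execution additionally requires NVIDIA's cuBLAS and cuDNN libraries to be available on the system; the CPU path shown in the quickstart below has no such requirement.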

Imports
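The core entry point, and the only import the quickstart below needs, is the `WhisperModel` class:

```python
# WhisperModel wraps model loading and transcription
from faster_whisper import WhisperModel
```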

Quickstart

Demonstrates loading a Whisper model and transcribing an audio file. The model will automatically download from Hugging Face Hub if not already cached. Uses CPU by default for broad compatibility; change `device` and `compute_type` for GPU acceleration.

from faster_whisper import WhisperModel
import os

# Ensure you have an audio file named 'audio.mp3' in the current directory
# For example, download a short audio clip or record one.
# Example: https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3

model_size = os.environ.get('WHISPER_MODEL_SIZE', 'tiny.en') # e.g., 'large-v3', 'medium', 'tiny.en'

# Run on CPU with INT8 compute type for general compatibility
# For GPU, change device='cuda' and compute_type='float16' if supported
model = WhisperModel(model_size, device='cpu', compute_type='int8')

# Transcribe the audio file
# Replace 'audio.mp3' with the path to your audio file
segments, info = model.transcribe("audio.mp3", beam_size=5)

print(f"Detected language '{info.language}' with probability {info.language_probability:.2f}")

# Note: segments is a generator; transcription runs lazily as you iterate
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
