{"id":2548,"library":"jiwer","title":"Jiwer","description":"Jiwer is a simple and fast Python package designed to evaluate Automatic Speech Recognition (ASR) systems. It computes similarity measures such as Word Error Rate (WER), Match Error Rate (MER), Word Information Lost (WIL), Word Information Preserved (WIP), and Character Error Rate (CER). It uses RapidFuzz, which leverages C++ under the hood, for efficient minimum-edit distance calculations, making it faster than pure Python implementations. The current version is 4.0.0, released in June 2025, and it maintains an active development and release cadence.","status":"active","version":"4.0.0","language":"en","source_language":"en","source_url":"https://github.com/jitsi/jiwer","tags":["speech-to-text","asr-evaluation","word-error-rate","wer","character-error-rate","cer","nlp","evaluation-metrics"],"install":[{"cmd":"pip install jiwer","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Used for efficient minimum-edit distance calculation, providing core performance benefits.","package":"rapidfuzz","optional":false},{"reason":"Required for the command-line interface (CLI) functionality.","package":"click","optional":true}],"imports":[{"symbol":"wer","correct":"from jiwer import wer"},{"symbol":"cer","correct":"from jiwer import cer"},{"note":"The 'compute_measures' function was renamed to 'process_words' in version 4.0.0. The return type also changed from a dictionary to a dataclass (WordOutput).","wrong":"output = jiwer.compute_measures(reference, hypothesis)","symbol":"process_words","correct":"import jiwer\noutput = jiwer.process_words(reference, hypothesis)"},{"symbol":"process_characters","correct":"import jiwer\noutput = jiwer.process_characters(reference, hypothesis)"},{"note":"Used for creating transformation pipelines for text normalization.","symbol":"Compose","correct":"from jiwer import Compose"}],"quickstart":{"code":"import jiwer\n\n# Calculate Word Error Rate (WER) for single strings\nreference_single = \"hello world\"\nhypothesis_single = \"hello duck\"\nerror_single = jiwer.wer(reference_single, hypothesis_single)\nprint(f\"WER (single): {error_single}\")\n\n# Calculate WER for multiple sentences (lists of strings)\nreferences_multiple = [\"hello world\", \"i like monthy python\"]\nhypotheses_multiple = [\"hello duck\", \"i like python\"]\nerror_multiple = jiwer.wer(references_multiple, hypotheses_multiple)\nprint(f\"WER (multiple): {error_multiple}\")\n\n# Get detailed output including alignments and all measures\noutput_details = jiwer.process_words(reference_single, hypothesis_single)\nprint(f\"\\nDetailed output WER: {output_details.wer}\")\nprint(f\"Detailed output MER: {output_details.mer}\")\nprint(f\"Alignments: {output_details.alignments}\")","lang":"python","description":"This quickstart demonstrates how to calculate the Word Error Rate (WER) for both single and multiple reference/hypothesis pairs using `jiwer.wer()`. It also shows how to use `jiwer.process_words()` to obtain a more detailed output, including various error measures and the alignment between the reference and hypothesis."},"warnings":[{"fix":"Update calls from `jiwer.compute_measures()` to `jiwer.process_words()` and `jiwer.visualize_measures()` to `jiwer.visualize_alignment()`. Adjust code to access results from the returned `WordOutput` or `CharacterOutput` dataclass attributes (e.g., `output.wer`) instead of dictionary keys.","message":"The functions `jiwer.compute_measures()` and `jiwer.visualize_measures()` were renamed in version 4.0.0. They are now `jiwer.process_words()` and `jiwer.visualize_alignment()` respectively. Additionally, `process_words` returns a `WordOutput` dataclass instead of a dictionary.","severity":"breaking","affected_versions":">=4.0.0"},{"fix":"Review existing code that processes empty or potentially empty reference/hypothesis pairs. The new behavior is generally safer, but ensure it aligns with your specific evaluation logic.","message":"The behavior for handling empty reference sentences changed in version 4.0.0. Previously, an empty reference with an empty hypothesis could lead to undefined behavior (division by zero). As of 4.0.0, this scenario is explicitly defined to yield zero error, supporting evaluation for models hallucinating on silent audio.","severity":"breaking","affected_versions":">=4.0.0"},{"fix":"If you are directly inspecting or parsing the `alignments` output from `process_words()` or `process_characters()`, update your code to access attributes of the `AlignmentChunk` dataclass (e.g., `chunk.type`, `chunk.ref_start_idx`) instead of tuple indices.","message":"The internal representation of alignment chunks changed in version 4.0.0. Alignments are now returned as a list of `jiwer.AlignmentChunk` dataclass objects, replacing the previous tuple-based format. This improves clarity and accessibility of alignment details.","severity":"breaking","affected_versions":">=4.0.0"},{"fix":"Understand that WER > 1.0 is an expected and valid outcome, indicating poor ASR performance with many extraneous words. Do not cap the WER at 1.0 in your reporting or analysis unless specifically required by a particular standard.","message":"Word Error Rate (WER) can exceed 100% (or 1.0). This occurs when the total number of errors (substitutions, deletions, and insertions) is greater than the number of words in the reference text, often due to a high number of insertions by the ASR system.","severity":"gotcha","affected_versions":"All versions"},{"fix":"If sentence-by-sentence error rates are needed, you must iterate through your sentence pairs and call `jiwer.wer()` (or `jiwer.cer()`) for each pair individually, or use `jiwer.process_words()`/`process_characters()` and then access the `.wer` or `.cer` attribute of the returned object for each item in the results list.","message":"When `jiwer.wer()` or `jiwer.cer()` are provided with lists of reference and hypothesis sentences, they internally concatenate all sentences to compute a *single, global* error rate for the entire dataset, which is standard for corpus-level evaluation. It does not return individual error rates per sentence.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Utilize `jiwer.Compose()` with transformation functions like `jiwer.ToLowerCase()`, `jiwer.RemovePunctuation()`, `jiwer.ExpandCommonEnglishContractions()`, etc., to build a preprocessing pipeline. Apply this pipeline to your reference and hypothesis texts before computing error rates.","message":"Jiwer calculates metrics on the raw input strings. For robust and fair evaluation, it is crucial to apply consistent text normalization (e.g., lowercasing, punctuation removal, expansion of contractions) to both reference and hypothesis strings *before* calling jiwer functions.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}