python-levenshtein
python-levenshtein is a Python C extension module providing highly optimized functions for computing string edit distances (such as Levenshtein distance), similarity ratios, and related metrics. The project has been renamed to `levenshtein` and is actively maintained under that name by the RapidFuzz team. The `python-levenshtein` PyPI package (version 0.27.3) is still published as a compatibility wrapper and continues to receive releases.
Warnings
- breaking Primary development of this library now happens under the `levenshtein` PyPI package, and `python-levenshtein` is maintained as a compatibility wrapper. Install `levenshtein` directly for new projects and to get the latest features and fixes. `pip install python-levenshtein` still works because it installs `levenshtein` as a dependency.
- gotcha The library is licensed under GPL-2.0. This copyleft license can be restrictive for projects with different licensing requirements, as it may necessitate that derivative works also be licensed under GPL.
- breaking Recent versions of the underlying `levenshtein` library (which `python-levenshtein` now wraps) have dropped support for older Python versions. For example, version 0.26.0 dropped support for Python 3.8, and 0.27.0 requires Python 3.10 or later.
- gotcha `python-levenshtein` (and `levenshtein`) is highly optimized thanks to its C extension. Even so, for extensive fuzzy matching, especially on large datasets or when a wider range of string similarity algorithms (e.g., Jaro-Winkler, token-based matching) is needed, the `rapidfuzz` library is often a more modern and performant alternative. `rapidfuzz` is also MIT-licensed, which is less restrictive than the GPL.
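As a rough illustration of the `rapidfuzz` alternative mentioned in the last warning, the sketch below picks the closest candidate for a query using token-based scoring. It assumes `rapidfuzz` is installed separately (`pip install rapidfuzz`). The helper name `best_match` and the sample data are made up for this example, and the import guard exists only so the snippet degrades to returning `None` when `rapidfuzz` is missing.

```python
# Sketch only: rapidfuzz is a separate package and is NOT installed by python-levenshtein.
try:
    from rapidfuzz import fuzz, process
except ImportError:  # keep the snippet importable without rapidfuzz
    fuzz = None
    process = None


def best_match(query, choices):
    """Return the choice most similar to `query`, or None if rapidfuzz is unavailable.

    `best_match` is a hypothetical helper, not part of either library.
    """
    if process is None:
        return None
    # extractOne returns a (choice, score, index) tuple for list inputs,
    # or None when no choice qualifies.
    result = process.extractOne(query, choices, scorer=fuzz.token_sort_ratio)
    return None if result is None else result[0]


teams = ["atlanta falcons", "new york jets", "new york giants", "dallas cowboys"]
print(best_match("new york mets", teams))
```

`fuzz.token_sort_ratio` sorts the whitespace-separated tokens before comparing, which is one of the token-based scorers that plain edit distance does not provide.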
Install
pip install python-levenshtein
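Because `python-levenshtein` pulls in `levenshtein` as a dependency, it can be useful to check which distributions actually ended up in an environment. The following is a minimal sketch using only the standard library's `importlib.metadata`. The helper name `installed_version` is hypothetical, and the distribution names are simply the ones mentioned on this page.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not part of any library.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report on the wrapper, the underlying package, and the suggested alternative.
for name in ("python-Levenshtein", "Levenshtein", "rapidfuzz"):
    version = installed_version(name)
    print(f"{name}: {version if version is not None else 'not installed'}")
```

If both the wrapper and the underlying package show a version, the import name is the same either way: `import Levenshtein`.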
Imports
- Levenshtein
import Levenshtein
Quickstart
import Levenshtein
str1 = "kitten"
str2 = "sitting"
# Calculate Levenshtein distance
distance = Levenshtein.distance(str1, str2)
print(f"Levenshtein distance between '{str1}' and '{str2}': {distance}")
# Calculate similarity ratio
ratio = Levenshtein.ratio(str1, str2)
print(f"Similarity ratio between '{str1}' and '{str2}': {ratio:.2f}")
# Example with different strings
str3 = "hello"
str4 = "hallo"
distance2 = Levenshtein.distance(str3, str4)
ratio2 = Levenshtein.ratio(str3, str4)
print(f"\nLevenshtein distance between '{str3}' and '{str4}': {distance2}")
print(f"Similarity ratio between '{str3}' and '{str4}': {ratio2:.2f}")
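To make the numbers in the quickstart easier to interpret, here is a pure-Python reference sketch of the two quantities. It is not the library's implementation, which is compiled and far faster. It also assumes, based on the behavior of recent `levenshtein` releases, that `ratio` is a normalized similarity in which a substitution counts as one deletion plus one insertion (cost 2). The function names `edit_distance` and `similarity_ratio` are made up for this example.

```python
def edit_distance(a, b, substitution_cost=1):
    """Dynamic-programming edit distance with unit insert/delete costs.

    With substitution_cost=1 this is the classic Levenshtein distance.
    """
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitute = previous[j - 1] + (0 if char_a == char_b else substitution_cost)
            delete = previous[j] + 1
            insert = current[j - 1] + 1
            current.append(min(substitute, delete, insert))
        previous = current
    return previous[-1]


def similarity_ratio(a, b):
    """Assumed definition: 1 - (distance with substitution cost 2) / (len(a) + len(b))."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b, substitution_cost=2) / total


print(edit_distance("kitten", "sitting"))                # 3
print(round(similarity_ratio("kitten", "sitting"), 2))   # 0.62
print(edit_distance("hello", "hallo"))                   # 1
print(round(similarity_ratio("hello", "hallo"), 2))      # 0.8
```

Comparing these values against `Levenshtein.distance` and `Levenshtein.ratio` on the same inputs is a quick way to confirm that the assumed ratio definition holds for the installed version.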