{"id":1745,"library":"thefuzz","title":"thefuzz (Fuzzy String Matching)","description":"thefuzz is a Python library for fuzzy string matching based on Levenshtein distance. It provides a simple API for scoring the similarity of two strings and for extracting the best matches from a collection of choices. The current version is 0.22.1, and the project is actively maintained with periodic releases.","status":"active","version":"0.22.1","language":"en","source_language":"en","source_url":"https://github.com/seatgeek/thefuzz","tags":["string matching","fuzzy logic","levenshtein distance","nlp","data cleaning"],"install":[{"cmd":"pip install thefuzz","lang":"bash","label":"Base installation (0.20.0 and later also install the rapidfuzz backend)"},{"cmd":"pip install thefuzz[speedup]","lang":"bash","label":"With python-Levenshtein speedup (relevant to 0.19.0 only; 0.20.0 and later use rapidfuzz instead)"}],"dependencies":[{"reason":"Required backend since 0.20.0. Provides the compiled string comparison algorithms, so no separate speedup package is needed.","package":"rapidfuzz","optional":false},{"reason":"Optional C-level speedup for 0.19.0, installed through the `speedup` extra. Not used by 0.20.0 and later.","package":"python-Levenshtein","optional":true}],"imports":[{"symbol":"fuzz","correct":"from thefuzz import fuzz"},{"symbol":"process","correct":"from thefuzz import process"},{"note":"The original 'fuzzywuzzy' library has been renamed and is now 'thefuzz'. 
Old imports work only if the legacy 'fuzzywuzzy' package is still installed.","wrong":"from fuzzywuzzy import fuzz, process","symbol":"fuzzywuzzy","correct":"from thefuzz import fuzz, process"}],"quickstart":{"code":"from thefuzz import fuzz\nfrom thefuzz import process\n\n# Basic string comparison\nscore = fuzz.ratio(\"this is a test\", \"this is a test!\")\nprint(f\"Ratio score: {score}\")\n\n# Find the best match in a list\nchoices = [\"apple pie\", \"grapefruit\", \"apple tree\"]\nquery = \"apple\"\nbest_match, best_score = process.extractOne(query, choices)\nprint(f\"Best match for '{query}': '{best_match}' with score {best_score}\")\n\n# Get top N matches\ntop_matches = process.extract(query, choices, limit=2)\nprint(f\"Top matches for '{query}': {top_matches}\")","lang":"python","description":"This example computes a basic similarity ratio between two strings, then finds the best match and the top N matches for a query string in a list of choices, using the `fuzz` and `process` modules."},"warnings":[{"fix":"Update all `fuzzywuzzy` imports to `thefuzz`. On 0.19.0, install the optional speedup with `pip install thefuzz[speedup]`; on 0.20.0 and later the rapidfuzz backend is installed automatically.","message":"The library was renamed from `fuzzywuzzy` to `thefuzz`. Importing `fuzzywuzzy` fails unless the legacy package is also installed. As of 0.20.0, rapidfuzz is a required dependency and replaces the optional `python-Levenshtein` speedup.","severity":"breaking","affected_versions":"fuzzywuzzy <= 0.18.0 to thefuzz >= 0.19.0"},{"fix":"Upgrade to `thefuzz` 0.20.0 or later, which always uses the rapidfuzz backend. If you must stay on 0.19.0, install with `pip install thefuzz[speedup]` and confirm that `python-Levenshtein` was actually installed, not just `thefuzz` by itself.","message":"Without the optional `python-Levenshtein` dependency (the 'speedup' extra), older releases fall back to a much slower pure-Python implementation based on `difflib`.","severity":"gotcha","affected_versions":"thefuzz < 0.20.0 (and fuzzywuzzy)"},{"fix":"When using `extractOne`, assign to two variables: `matched_string, score = process.extractOne(...)`. 
When using `extract`, iterate over the list of tuples: `for item, score in process.extract(...)`. If `choices` is a dict, each result is a 3-tuple `(match, score, key)`, and `extractOne` returns `None` when `score_cutoff` is set and no choice reaches it.","message":"`process.extract` returns a list of `(match, score)` tuples, and `process.extractOne` returns a single such tuple. In each tuple the first element is the matched string and the second is the score, so be careful when destructuring the results.","severity":"gotcha","affected_versions":"All versions of `thefuzz`"},{"fix":"Understand the differences: `ratio` compares the full strings in order, `partial_ratio` scores the best-matching substring, `token_sort_ratio` ignores word order, and `token_set_ratio` tolerates duplicated or extra words. Choose based on how your strings are expected to differ.","message":"The ratio functions (`fuzz.ratio`, `fuzz.partial_ratio`, `fuzz.token_sort_ratio`, `fuzz.token_set_ratio`) are suited to different scenarios. Using the wrong one can produce unintuitive scores.","severity":"gotcha","affected_versions":"All versions of `thefuzz`"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}