fuzzyset2: Fuzzy String Matching

0.2.5 · active · verified Thu Apr 16

fuzzyset2 is a Python library that provides a data structure for fuzzy string matching, similar in spirit to full-text search. It finds likely misspellings and approximate matches by breaking strings into n-grams, looking those n-grams up in a reverse index, and ranking the candidates by cosine similarity. It is a maintained fork of the original 'fuzzyset' package that addresses that package's installation and maintenance issues. The current version is 0.2.5, and the project is listed as active.
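
To make the n-gram, reverse index, and cosine similarity steps concrete, here is a minimal standard-library sketch of the general technique. It is not fuzzyset2's actual implementation: the TinyIndex class, the helper names, the fixed bigram size, and the '-' padding are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def ngrams(text, n=2):
    """Count the overlapping n-grams of a lowercased, padded string."""
    # Padding with '-' marks the word boundaries (an assumption of this sketch).
    padded = f"-{text.lower()}-"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyIndex:
    """Hypothetical toy index: maps each n-gram to the strings containing it."""

    def __init__(self, values, n=2):
        self.n = n
        self.vectors = {v: ngrams(v, n) for v in values}
        self.reverse = defaultdict(set)
        for value, grams in self.vectors.items():
            for gram in grams:
                self.reverse[gram].add(value)

    def search(self, query):
        q = ngrams(query, self.n)
        # Only strings sharing at least one n-gram with the query get scored.
        candidates = set().union(*(self.reverse.get(g, set()) for g in q))
        scored = [(cosine(q, self.vectors[v]), v) for v in candidates]
        return sorted(scored, reverse=True)


index = TinyIndex(["apple", "banana", "orange"])
results = index.search("appel")
print(results)
```

The reverse index is what keeps lookups cheap: a query is compared only against stored strings that share at least one n-gram with it, rather than against every string in the set.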

Install

The package is published on PyPI under the distribution name fuzzyset2:

pip install fuzzyset2

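Imports

The distribution is named fuzzyset2, but the module you import is fuzzyset:

from fuzzyset import FuzzySet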

Quickstart

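Initialize a FuzzySet, optionally with an iterable of strings, and add more strings with .add(). Use the .get() method to find approximate matches for a query string. The result is a list of (score, matched_value) tuples, where score is a similarity between 0 and 1. When nothing matches, .get() returns None, or the default value you pass as its second argument.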

from fuzzyset import FuzzySet

# Initialize with an iterable or add strings later
a = FuzzySet(['apple', 'banana', 'orange'])

# Add a new string
a.add('aple')

# Get fuzzy matches
matches = a.get('appel')
print(f"Matches for 'appel': {matches}")

matches = a.get('banan')
print(f"Matches for 'banan': {matches}")

# Subscript access runs the same lookup as .get(), but raises KeyError
# when nothing matches instead of returning None
# matches = a['apple']  # a list of (score, value) tuples
# print(matches)
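
Because .get() may return None, calling code usually wants a guard before unpacking the result. The helper below is one possible sketch of that pattern: best_match and its min_score cutoff are hypothetical names, not part of fuzzyset2, and the function only assumes that its first argument has a FuzzySet-style .get() method.

```python
def best_match(index, query, min_score=0.6):
    """Return the top-scoring value for `query`, or None if nothing is close enough.

    `index` is anything with a FuzzySet-style .get() that returns a list of
    (score, value) tuples, or None when there is no match.
    """
    matches = index.get(query)
    if not matches:
        return None
    # Pick the highest score explicitly rather than relying on result order.
    score, value = max(matches, key=lambda pair: pair[0])
    # min_score is an illustrative cutoff; tune it for your data.
    return value if score >= min_score else None
```

With the set from the quickstart, best_match(a, 'appel') would return the closest stored string, or None if its score falls below the cutoff.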
