cdifflib

1.2.9 · active · verified Thu Apr 16

cdifflib is a Python library that provides a C implementation of parts of the standard library's `difflib` module, specifically `SequenceMatcher`. It exposes a `CSequenceMatcher` type that inherits most of its methods from `difflib.SequenceMatcher` and is reported to run up to 4x faster when diffing large streams. The current version is 1.2.9; maintenance releases are irregular but ongoing, mainly to support newer Python versions and fix issues.

Common errors

Warnings

Install

Imports
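Because `CSequenceMatcher` inherits from `difflib.SequenceMatcher`, it can usually be swapped in behind a single import. The following is a minimal sketch, assuming cdifflib may not be installed in every environment; the `Matcher` alias and the `similarity` helper are hypothetical names used only for illustration.

```python
import difflib

try:
    # Prefer the C-accelerated matcher when cdifflib is installed.
    from cdifflib import CSequenceMatcher as Matcher
except ImportError:
    # Otherwise fall back to the pure-Python class from the standard library.
    Matcher = difflib.SequenceMatcher


def similarity(a, b):
    """Return the similarity ratio of two sequences (hypothetical helper)."""
    return Matcher(None, a, b).ratio()


print(similarity("abcd", "bcde"))
```

With this pattern the rest of the code refers only to `Matcher`, so it keeps working, just more slowly, where the C extension is unavailable.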

Quickstart

This quickstart shows how to instantiate `CSequenceMatcher` and call its `find_longest_match` and `ratio` methods, which are used the same way as on `difflib.SequenceMatcher`.

from cdifflib import CSequenceMatcher

# Example 1: Basic sequence matching (passing None disables junk filtering).
# find_longest_match searches a[0:5] against b[0:9].
s = CSequenceMatcher(None, ' abcd', 'abcd abcd')
match = s.find_longest_match(0, 5, 0, 9)
print(f"Longest match: {match}")

# Example 2: Custom junk filter that treats space characters as junk
s2 = CSequenceMatcher(lambda x: x == " ",
                      "private Thread currentThread;",
                      "private volatile Thread currentThread;")
ratio = round(s2.ratio(), 3)
print(f"Similarity ratio: {ratio}")
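Other inherited methods, such as `get_opcodes`, should be usable in the same way. The sketch below assumes that behaviour matches `difflib.SequenceMatcher` and falls back to the standard-library class if cdifflib is missing; `describe_changes` is a hypothetical helper, not part of either library.

```python
import difflib

try:
    from cdifflib import CSequenceMatcher as Matcher
except ImportError:
    # Assumption: the pure-Python matcher is an acceptable stand-in.
    Matcher = difflib.SequenceMatcher


def describe_changes(a, b):
    """List (tag, old_slice, new_slice) tuples that turn a into b."""
    matcher = Matcher(None, a, b)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


# Print each edit operation needed to turn the first string into the second.
for change in describe_changes("qabxcd", "abycdf"):
    print(change)
```

Each tuple carries one of the tags `equal`, `replace`, `delete`, or `insert`, together with the affected slices of both inputs.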
