pyhacrf-datamade

0.2.8 verified Fri May 01 auth: no python

Hidden alignment conditional random field (HACRF) for discriminative string edit distance. Current version 0.2.8, infrequent releases.

pip install pyhacrf-datamade
error ValueError: Buffer dtype mismatch, expected 'int64' but got 'int32'
cause On Windows, NumPy's default integer dtype is int32 (before NumPy 2.0), but the Cython extension expects int64 buffers.
fix
Explicitly cast arrays: np.array(arr, dtype=np.int64)
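A minimal sketch of the cast described in this fix. The helper name `as_int64` and the sample array are illustrative assumptions, not part of pyhacrf; only NumPy is used.

```python
import numpy as np


def as_int64(arr):
    """Return arr as an int64 NumPy array (hypothetical helper, not a pyhacrf API).

    np.asarray only copies when the dtype actually has to change.
    """
    return np.asarray(arr, dtype=np.int64)


# Simulate an array created with the int32 default seen on Windows.
lengths = np.array([3, 3, 4], dtype=np.int32)
fixed = as_int64(lengths)
print(fixed.dtype)  # int64
```

Applying the cast at the point where arrays are handed to the library keeps the rest of the code independent of the platform's default integer size.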
error AttributeError: 'HACRF' object has no attribute 'scores_'
cause The 'scores_' attribute was renamed to 'decision_function' in version 0.2.6.
fix
Use model.decision_function(X) instead of model.scores_
gotcha Input strings must be passed as lists of characters (or tokens), not as raw strings. For example, pass ['a','b','c'] not 'abc'.
fix Convert strings to list: list('abc')
deprecated Version 0.2.6 changed the fit API: extra arguments previously accepted by 'model.fit(X, y, ...)' were dropped, and the call is now 'model.fit(X, y)'. Some older examples still pass deprecated arguments such as 'verbose'.
fix Remove deprecated arguments; update to current fit signature.
gotcha The model is trained on pairs of strings, not on single strings. The library does not pair examples for you; construct the paired examples yourself.
fix Build the pairs with X = list(zip(strings1, strings2)), then call model.fit(X, y). Make sure each string in a pair is a list of characters.
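The pairing step can be sketched in plain Python. `make_pairs` is a hypothetical helper written for illustration, not a pyhacrf function, and it assumes the character-list convention stated in the entries above.

```python
def make_pairs(strings1, strings2):
    """Zip two equal-length lists of strings into pairs of character lists.

    Hypothetical helper for illustration; not part of pyhacrf.
    """
    if len(strings1) != len(strings2):
        raise ValueError("strings1 and strings2 must have the same length")
    return [(list(a), list(b)) for a, b in zip(strings1, strings2)]


X = make_pairs(["abc", "abd"], ["abd", "abce"])
print(X[0])  # (['a', 'b', 'c'], ['a', 'b', 'd'])
```

The explicit length check is there because zip silently truncates to the shorter list, which would otherwise drop examples without warning.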
breaking Between 0.2.4 and 0.2.6 the internal Cython alignment structure changed, so models saved with older versions may not load.
fix Retrain models after upgrading.

Minimal example: train HACRF on string pairs and predict.

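# Class names follow the upstream pyhacrf README (Hacrf, StringPairFeatureExtractor).
from pyhacrf import Hacrf, StringPairFeatureExtractor

# Each training example is a pair of strings, given as lists of characters.
pairs = [('abc', 'abd'), ('abc', 'abce'), ('abc', 'xyz'), ('abd', 'hello')]
X = [(list(a), list(b)) for a, b in pairs]
y = ['match', 'match', 'non-match', 'non-match']

# Turn each pair into the feature array the model trains on.
feature_extractor = StringPairFeatureExtractor(match=True, numeric=True)
X_features = feature_extractor.fit_transform(X)

model = Hacrf()
model.fit(X_features, y)

# New pairs go through the same extractor before prediction.
X_new = feature_extractor.transform([(list('abe'), list('abc'))])
predictions = model.predict(X_new)
print(predictions)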