pyhacrf-datamade
0.2.8 verified Fri May 01 auth: no python
Hidden alignment conditional random field (HACRF) for discriminative string edit distance. Current version 0.2.8, infrequent releases.
pip install pyhacrf-datamade
Common errors
error ValueError: Buffer dtype mismatch, expected 'int64' but got 'int32'
cause On Windows, NumPy versions before 2.0 default to int32 for integer arrays, but the Cython extension expects int64.
fix
Explicitly cast arrays: np.array(arr, dtype=np.int64)
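As a minimal sketch of that cast, the snippet below coerces an integer array to int64 before it is handed to the model. The helper name as_int64 is hypothetical and not part of pyhacrf; only NumPy calls are used.

```python
import numpy as np


def as_int64(arr):
    """Return arr as an int64 NumPy array (hypothetical helper, not part of pyhacrf)."""
    # np.asarray avoids a copy when the input is already an int64 array.
    return np.asarray(arr, dtype=np.int64)


# Simulate an int32 array such as the one that triggers the dtype mismatch.
idx32 = np.array([0, 1, 2], dtype=np.int32)
idx64 = as_int64(idx32)
print(idx32.dtype, "->", idx64.dtype)
```

Casting at the boundary, just before calling into the library, keeps the rest of the code independent of the platform's default integer width.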
error AttributeError: 'HACRF' object has no attribute 'scores_'
cause The 'scores_' attribute was renamed to 'decision_function' in version 0.2.6.
fix
Use model.decision_function(X) instead of model.scores_
Warnings
gotcha Input strings must be passed as lists of characters (or tokens), not as raw strings. For example, pass ['a','b','c'], not 'abc'.
fix Convert strings to list: list('abc')
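A small sketch of that conversion applied to a whole batch. The helper names to_tokens and prepare are hypothetical, introduced only for illustration; they use plain Python and no pyhacrf calls.

```python
def to_tokens(s):
    """Turn a raw string into a list of single-character tokens.

    Input that is already a sequence of tokens is copied into a list unchanged.
    """
    return list(s)


def prepare(strings):
    """Tokenize every string in a batch (hypothetical helper, not part of pyhacrf)."""
    return [to_tokens(s) for s in strings]


X = prepare(["abc", "abd"])
print(X)
```

Because list() also accepts an existing list of tokens, running prepare on data that is already tokenized does no harm.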
deprecated Version 0.2.6 changed the API: calls that previously passed extra arguments, as in 'model.fit(X, y, ...)', must now use 'model.fit(X, y)'. Some older examples still show deprecated arguments such as 'verbose'.
fix Remove deprecated arguments; update to current fit signature.
gotcha Each element of X is a single sequence to align, not a pair of strings. For pairwise alignment, you need to construct the paired examples yourself.
fix For pairwise edit distance, build one list of pairs with X = list(zip(strings1, strings2)), then call model.fit(X, y). Make sure each string in a pair has first been converted to a list of characters.
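The pairing step can be sketched as follows. The function make_pairs is a hypothetical helper, and the sketch only prepares the data; whether the model accepts tuples of character lists in exactly this shape is taken from the note above and is not verified here.

```python
def make_pairs(strings1, strings2):
    """Zip two equal-length lists of strings into pairs of character lists.

    Hypothetical helper for illustration; not part of pyhacrf.
    """
    if len(strings1) != len(strings2):
        raise ValueError("both lists must have the same length")
    # Each raw string becomes a list of characters before pairing.
    return [(list(a), list(b)) for a, b in zip(strings1, strings2)]


X = make_pairs(["kitten", "home"], ["sitting", "h0me"])
print(X[1])
```

The explicit length check matters because zip silently truncates to the shorter input, which would leave X and y misaligned.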
breaking When upgrading from 0.2.4 to 0.2.6, the internal Cython alignment structure changed; models saved with older versions may not load.
fix Retrain models after upgrading.
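One way to notice a stale model before it fails is to store the library version next to the saved object and check it on load. The sketch below assumes models are saved with pickle, which the note above does not state; save_model and load_model are hypothetical helpers, and a plain dict stands in for a trained model.

```python
import os
import pickle
import tempfile


def save_model(model, path, version):
    """Pickle the model together with the library version that produced it."""
    with open(path, "wb") as fh:
        pickle.dump({"version": version, "model": model}, fh)


def load_model(path, current_version):
    """Load a model, refusing files written by a different library version."""
    with open(path, "rb") as fh:
        payload = pickle.load(fh)
    if payload["version"] != current_version:
        raise RuntimeError(
            "model was saved with version %s but %s is installed; retrain it"
            % (payload["version"], current_version)
        )
    return payload["model"]


# Demonstration with a stand-in object instead of a real trained model.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    save_model({"weights": [0.1, 0.2]}, path, version="0.2.8")
    restored = load_model(path, current_version="0.2.8")
    print(restored)
```

An exact version match is deliberately strict; a looser comparison would need to know which releases changed the saved format.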
Imports
- HACRF
wrong from pyhacrf.hacrf import HACRF
correct from pyhacrf import HACRF
Quickstart
from pyhacrf import HACRF
# Example strings as lists of characters
X = [list('abc'), list('abd'), list('abce')]
y = [0, 0, 1]
model = HACRF()
model.fit(X, y)
predictions = model.predict([list('abe')])
print(predictions)