{"id":27101,"library":"knnimpute","title":"KNN Impute","description":"A lightweight Python library for k-Nearest Neighbor imputation of missing values in datasets. Version 0.1.0 appears to be the initial release and has seen few updates since; the last commit on GitHub was in 2018. The library is in maintenance mode.","status":"maintenance","version":"0.1.0","language":"python","source_language":"en","source_url":"https://github.com/hammerlab/knnimpute","tags":["imputation","missing-data","knn","data-cleaning"],"install":[{"cmd":"pip install knnimpute","lang":"bash","label":"pip install knnimpute"}],"dependencies":[{"reason":"Required for array operations.","package":"numpy","optional":false},{"reason":"Required for distance computations.","package":"scipy","optional":false}],"imports":[{"note":"The package is named knnimpute, not knn_impute. Importing from knn_impute raises ModuleNotFoundError.","wrong":"from knn_impute import knn_impute","symbol":"knn_impute","correct":"from knnimpute import knn_impute"},{"note":"Use when each feature has only a few observed values.","wrong":"","symbol":"knn_impute_few_observed","correct":"from knnimpute import knn_impute_few_observed"}],"quickstart":{"code":"import numpy as np\nfrom knnimpute import knn_impute\n\n# Simulate data with missing values\nX = np.array([[1, 2, np.nan], [4, np.nan, 6], [7, 8, 9]])\n\n# Impute using k=3\nX_imputed = knn_impute(X, k=3)\n\n# NaN entries are replaced with estimates from neighboring rows\nprint(X_imputed)","lang":"python","description":"Basic imputation of a matrix with missing values."},"warnings":[{"fix":"Use knn_impute for standard tasks; use knn_impute_few_observed only when many features have few observed values.","message":"The function knn_impute_few_observed is not a drop-in replacement for knn_impute; it uses a different algorithm and signature. Read its documentation before using it.","severity":"breaking","affected_versions":">=0.1.0"},{"fix":"Convert categorical variables to numeric values using one-hot encoding or label encoding before calling knn_impute.","message":"The library does not handle non-numeric data. All columns must be numeric; categorical data must be encoded beforehand.","severity":"gotcha","affected_versions":">=0.1.0"},{"fix":"Ensure missing entries are np.nan. Use np.isnan() to check for them, and convert None or other sentinels to np.nan.","message":"Missing values must be represented as np.nan. Using None or other sentinels will cause incorrect behavior or errors.","severity":"gotcha","affected_versions":">=0.1.0"}],"env_vars":null,"last_verified":"2026-05-01T00:00:00.000Z","next_check":"2026-07-30T00:00:00.000Z","problems":[{"fix":"Install with 'pip install knnimpute' and import with 'from knnimpute import knn_impute'.","cause":"Incorrect import statement; the package is knnimpute, not knn_impute.","error":"ModuleNotFoundError: No module named 'knn_impute'"},{"fix":"Check that the matrix does not contain inf values. Replace inf with np.nan before imputation: X[np.isinf(X)] = np.nan","cause":"Either the matrix still contains np.nan after imputation, or the data contains infinite values, which are not handled.","error":"ValueError: Input contains NaN, infinity or a value too large for dtype('float64')."},{"fix":"Ensure k is an integer: k=int(k) or pass k=3 (not 3.0).","cause":"The k parameter was likely passed as a float (e.g., k=3.0).","error":"TypeError: 'numpy.float64' object cannot be interpreted as an integer"}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null}