{"id":5957,"library":"hnswlib","title":"Hnswlib","description":"Hnswlib is a lightweight, header-only C++ library with Python bindings designed for fast Approximate Nearest Neighbor (ANN) search. It implements the Hierarchical Navigable Small Worlds (HNSW) algorithm, enabling efficient similarity search in high-dimensional vector spaces. The library supports dynamic updates (insertion and deletion of elements) and various distance metrics like L2, Inner Product, and Cosine similarity. Its current stable release on PyPI is 0.8.0, with version 0.9.0 recently released on GitHub, and it maintains a relatively active release cadence.","status":"active","version":"0.8.0","language":"en","source_language":"en","source_url":"https://github.com/nmslib/hnswlib","tags":["vector database","similarity search","nearest neighbors","ann","embedding"],"install":[{"cmd":"pip install hnswlib","lang":"bash","label":"Install from PyPI"}],"dependencies":[{"reason":"Required for handling vector data (NumPy arrays) in Python bindings.","package":"numpy","optional":false}],"imports":[{"symbol":"Index","correct":"import hnswlib\nimport numpy as np\n\nindex = hnswlib.Index(space='l2', dim=128)"}],"quickstart":{"code":"import hnswlib\nimport numpy as np\nimport os\n\n# Define data parameters\ndim = 128\nnum_elements = 10000\n\n# Generate random data\ndata = np.float32(np.random.random((num_elements, dim)))\ndata_labels = np.arange(num_elements)\n\n# Initialize the HNSW index\n# Possible space options: 'l2', 'ip' (inner product), 'cosine'\nspace_name = 'l2' # Euclidean distance\nindex = hnswlib.Index(space=space_name, dim=dim)\n\n# Set index parameters BEFORE adding data\n# max_elements: current capacity\n# ef_construction: accuracy vs. 
construction speed trade-off\n# M: number of bi-directional links per data point\nindex.init_index(max_elements=num_elements, ef_construction=200, M=16)\n\n# Add items to the index\nindex.add_items(data, data_labels)\n\n# Set query time accuracy/speed trade-off\n# Note: This parameter is NOT saved with the index and must be set after loading.\nindex.set_ef(50)\n\n# Generate a query vector\nquery_vector = np.float32(np.random.random((1, dim)))\n\n# Perform a k-nearest neighbor query\nk = 5\nlabels, distances = index.knn_query(query_vector, k=k)\n\nprint(f\"Query vector: {query_vector[0][:5]}...\")\nprint(f\"Nearest neighbor labels: {labels[0]}\")\nprint(f\"Distances to neighbors: {distances[0]}\")\n\n# Example of saving and loading the index\nindex_path = 'my_hnsw_index.bin'\nindex.save_index(index_path)\n\nloaded_index = hnswlib.Index(space=space_name, dim=dim)\nloaded_index.load_index(index_path)\nloaded_index.set_ef(50) # Re-set ef after loading\n\nloaded_labels, loaded_distances = loaded_index.knn_query(query_vector, k=k)\nprint(f\"Loaded index nearest neighbor labels: {loaded_labels[0]}\")\n\nos.remove(index_path)","lang":"python","description":"This quickstart demonstrates how to create an HNSW index, initialize it with parameters, add vector data, perform a k-nearest neighbor query, and then save and load the index. It highlights the importance of re-setting the `ef` parameter after loading the index."},"warnings":[{"fix":"Rebuild indices with a supported `hnswlib` version. Consider exporting data and re-importing if migration is critical.","message":"Indices saved with very old versions (prior to v0.3.4) are not supported and cannot be loaded with newer `hnswlib` versions.","severity":"breaking","affected_versions":"< 0.3.4"},{"fix":"Upgrade to `hnswlib` v0.6.2 or later to prevent corruption of large pickled indices. 
Rebuild any potentially corrupted indices.","message":"Saving and loading of large pickled indices (greater than 4GB) in versions prior to 0.6.2 could lead to data corruption.","severity":"breaking","affected_versions":"< 0.6.2"},{"fix":"Ensure that the `hnswlib` library is compiled and indices are used on machines with compatible CPU architectures and instruction sets. Recompile without specific AVX flags if maximum portability is needed, or rebuild indices on the target architecture.","message":"Indices built with AVX512 or AVX optimizations (enabled during compilation) in `hnswlib` v0.6.1 and later may not be backwards-compatible with older SSE or non-AVX512 architectures. This can cause issues when moving indices between machines with different CPU capabilities.","severity":"breaking","affected_versions":">= 0.6.1"},{"fix":"Always call `index.set_ef(value)` after loading an index to configure the desired query performance.","message":"The `ef` parameter, which controls the query-time accuracy/speed trade-off, is *not* saved as part of the index. It must be manually set after loading a saved index.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Review multi-threaded search logic; if statistic aggregation was implicitly relied upon, evaluate impact. Check release notes for explicit alternatives if available.","message":"In `hnswlib` v0.8.0, statistic aggregation was removed by default for multi-threaded search to improve speed. 
Users who relied on this feature may observe changed behavior or need explicit configuration if aggregation is still required.","severity":"breaking","affected_versions":">= 0.8.0"},{"fix":"Upgrade to `hnswlib` v0.9.0 or later, once it is released on PyPI, for correct filter behavior and explicit error handling when `k` exceeds the number of available elements.","message":"In versions prior to 0.9.0 (currently on GitHub, not yet a stable PyPI release), brute-force searches with filters contained bugs that could produce incorrect results, and normalization checks were missing. As of 0.9.0, searching for `k` elements when fewer than `k` are available explicitly throws an exception.","severity":"gotcha","affected_versions":"< 0.9.0"}],"env_vars":null,"last_verified":"2026-04-14T00:00:00.000Z","next_check":"2026-07-13T00:00:00.000Z"}