NMSLIB (Non-Metric Space Library)

2.1.2 · active · verified Thu Apr 16

NMSLIB (Non-Metric Space Library) is a cross-platform similarity search library with Python bindings, designed for approximate nearest neighbor search in high-dimensional spaces, including non-metric ones. It is widely used for vector similarity tasks, often on embeddings produced by libraries such as Gensim. The current version is 2.1.2; releases focus on performance, new features, and broader platform and Python version support.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart initializes an HNSW index, adds a batch of data points (random float32 vectors), builds the index, and runs a k-nearest-neighbor query. It uses `numpy` to generate the data and `nmslib` to index and query it. The `post` parameter passed to `createIndex` is an HNSW index-time parameter that controls how much post-processing is applied to the built graph (0 disables it; 2 applies the most).

import nmslib
import numpy

# Create a random matrix to index
data = numpy.random.randn(10000, 100).astype(numpy.float32)

# Initialize a new index using HNSW on Cosine Similarity
index = nmslib.init(method='hnsw', space='cosinesimil')
index.addDataPointBatch(data)
index.createIndex({'post': 2}, print_progress=True)

# Query for the nearest neighbours of the first datapoint
ids, distances = index.knnQuery(data[0], k=10)
print(f"Nearest neighbors for data[0]: {ids}, distances: {distances}")

# Get all nearest neighbours for all the datapoints using multiple threads
# neighbours = index.knnQueryBatch(data, k=10, num_threads=4)
# print(f"Batch query results (first entry): {neighbours[0]}")
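
Because HNSW returns approximate neighbours, it can help to compare a few query results against an exact brute-force search. The following is a minimal sketch of such a baseline in pure Python. It assumes that the `cosinesimil` space reports distances as 1 minus cosine similarity. The helper names `cosine_distance`, `exact_knn`, and `recall_at_k` are hypothetical and not part of the nmslib API.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def exact_knn(data, query, k):
    """Brute-force search: return (ids, distances) of the k closest rows, closest first."""
    scored = sorted((cosine_distance(row, query), i) for i, row in enumerate(data))
    top = scored[:k]
    return [i for _, i in top], [d for d, _ in top]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbours that also appear in the approximate result."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Tiny hand-made dataset (hypothetical values, for illustration only)
points = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [-1.0, 0.0],
]

exact_ids, exact_dists = exact_knn(points, [1.0, 0.0], k=2)
print(exact_ids, exact_dists)

# Compare a made-up "approximate" result against the exact one
print(recall_at_k([0, 1], exact_ids))
```

In practice, the ids returned by `index.knnQuery` would take the place of the made-up approximate result, and `data` converted to lists (or a NumPy-based equivalent of `exact_knn`) would take the place of `points`. Brute force scales linearly with the dataset size, so it is only practical on a small sample of queries.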
