{"id":9296,"library":"scann","title":"Scann","description":"ScaNN (Scalable Nearest Neighbors) is a library by Google Research for efficient vector similarity search at scale, implementing techniques like search space pruning and quantization. It offers both Python and TensorFlow APIs and is known for its speed and scalability with large datasets. The current version is 1.4.2; the library is actively maintained and distributed through PyPI.","status":"active","version":"1.4.2","language":"en","source_language":"en","source_url":"https://github.com/google-research/google-research/tree/master/scann","tags":["nearest-neighbor","similarity-search","machine-learning","vector-search","google-research","approximate-nearest-neighbor"],"install":[{"cmd":"pip install scann","lang":"bash","label":"Base installation"},{"cmd":"pip install scann[tf]","lang":"bash","label":"With TensorFlow integration"}],"dependencies":[{"reason":"Requires Python >=3.9, <3.14.","package":"Python","optional":false},{"reason":"Requires libstdc++ version 3.4.23 or above from the operating system for manylinux_2_27 wheels.","package":"libstdc++","optional":false},{"reason":"Optional for TensorFlow op bindings, installed via `scann[tf]`.","package":"tensorflow","optional":true}],"imports":[{"note":"Main module for ScaNN's Python API.","symbol":"scann","correct":"import scann"},{"note":"Accesses the native Python bindings for ScaNN, used for building searchers without TensorFlow dependencies.","symbol":"scann_ops_pybind","correct":"import scann\nsearcher = scann.scann_ops_pybind.builder(...).build()"},{"note":"Accesses the TensorFlow op bindings for ScaNN. Requires `scann[tf]` to be installed. A `ScannBuilder` object is typically obtained by calling `scann.scann_ops.builder()` or `scann.scann_ops_pybind.builder()` rather than by importing the class directly.","wrong":"from scann import ScannBuilder","symbol":"scann_ops","correct":"import scann\nsearcher = scann.scann_ops.builder(...).build()"}],"quickstart":{"code":"import numpy as np\nimport scann\n\n# 1. 
Prepare your dataset (e.g., embeddings)\n# For demonstration, creating a random dataset of 1000 vectors, 128 dimensions each.\ndataset = np.random.rand(1000, 128).astype(np.float32)\n\n# 2. Build the ScaNN searcher\n# This example uses dot product distance for Maximum Inner Product Search (MIPS).\n# num_leaves: Number of leaves in the tree for partitioning.\n# num_leaves_to_search: Number of leaves to search at query time.\n# anisotropic_quantization_threshold: Parameter for Anisotropic Vector Quantization.\n\nsearcher = scann.scann_ops_pybind.builder(\n    dataset,\n    num_neighbors=10, # Number of nearest neighbors to retrieve\n    distance_measure=\"dot_product\"\n).tree(\n    num_leaves=100,\n    num_leaves_to_search=10\n).score_ah(\n    dimensions_per_block=2, # Recommended for MIPS\n    anisotropic_quantization_threshold=0.2\n).reorder(\n    100 # Rescore top 100 candidates to improve accuracy\n).build()\n\n# 3. Define a query vector\nquery = np.random.rand(128).astype(np.float32)\n\n# 4. Perform a search\nneighbors, distances = searcher.search(query)\n\nprint(f\"Query vector shape: {query.shape}\")\nprint(f\"Dataset shape: {dataset.shape}\")\nprint(f\"Found {len(neighbors)} neighbors: {neighbors}\")\nprint(f\"Corresponding distances: {distances}\")","lang":"python","description":"This quickstart demonstrates how to create a ScaNN searcher with a sample dataset, configure it for Maximum Inner Product Search (MIPS) using tree partitioning and anisotropic quantization, and then perform a similarity search. It uses the native Python API (`scann.scann_ops_pybind`) which does not require TensorFlow. The example generates random data for simplicity, but in a real application, `dataset` would be your actual high-dimensional vectors (e.g., embeddings)."},"warnings":[{"fix":"Always check `requires_python` on PyPI and consult `docs/releases.md` on GitHub for specific version compatibility with Python and TensorFlow. 
Upgrade or downgrade your Python/TensorFlow environment as needed.","message":"Each ScaNN release supports a specific range of Python versions and is built against a specific TensorFlow version. For example, ScaNN 1.2.0 dropped Python 3.5 support and was built against TensorFlow 2.4.0, making it incompatible with TensorFlow 2.3.x. More recent versions (e.g., 1.4.x) support Python 3.9-3.13.","severity":"breaking","affected_versions":"<1.4.x"},{"fix":"If you intend to use ScaNN's TensorFlow ops (e.g., for SavedModels), install with `pip install scann[tf]`.","message":"As of ScaNN 1.4.0, TensorFlow op bindings are no longer enabled by default. `pip install scann` will *not* include TensorFlow integration.","severity":"breaking","affected_versions":">=1.4.0"},{"fix":"Ensure your system's CPU supports the required instruction sets and `libstdc++` meets the version requirement. If not, consider building from source or using a compatible environment (e.g., Docker for Linux environments on incompatible systems).","message":"ScaNN wheels have system-level dependencies. x86 wheels require AVX and FMA instruction set support, while ARM wheels require NEON. Additionally, `manylinux_2_27` compatible wheels require `libstdc++` version 3.4.23 or above.","severity":"gotcha","affected_versions":"All"},{"fix":"Update your code to use the modern builder pattern: `scann.scann_ops_pybind.builder(...).build()`.","message":"The `ScannBuilder` API changed in version 1.1.0. Rather than calling `create_tf` or `create_pybind` directly on a `ScannBuilder` object, you now call the `builder()` function in `scann_ops` or `scann_ops_pybind` to get a `ScannBuilder` object, and then call `build()` on it.","severity":"deprecated","affected_versions":"<1.1.0"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Check your Python version (`python --version`) and ensure it falls within the supported range. Upgrade pip (`pip install --upgrade pip`). 
If you are on macOS or Windows, consider using a Linux environment (e.g., Docker, WSL) or building from source.","cause":"This error typically occurs if your Python version is not within the supported range (e.g., <3.9 or >=3.14 for current versions), if your operating system architecture is not supported by available wheels (e.g., macOS or Windows without WSL), or if pip is outdated.","error":"ERROR: Could not find a version that satisfies the requirement scann (from versions: none)\nERROR: No matching distribution found for scann"},{"fix":"Ensure you are passing the `dataset` (a numpy array), `num_neighbors` (an integer), and `distance_measure` (a string) to `builder()`: `scann.scann_ops_pybind.builder(dataset, num_neighbors=10, distance_measure=\"dot_product\")`.","cause":"This indicates that `builder()`, from either `scann.scann_ops_pybind` or `scann.scann_ops`, was called without its required arguments. The `builder()` function requires the dataset, number of neighbors to retrieve, and the distance measure (e.g., 'dot_product', 'squared_l2') as its initial arguments.","error":"TypeError: builder() missing 3 required positional arguments: 'db', 'num_neighbors', and 'distance_measure'"},{"fix":"Ensure that the `dataset` array passed to the ScaNN builder is not empty and contains valid embedding vectors before attempting to build the searcher.","cause":"The ScaNN builder (or related indexing functions) was provided with an empty dataset or a dataset that became empty after filtering. ScaNN requires data to build its index.","error":"ERROR: Cannot create ScaNN index with empty table."}]}