Intel oneAPI Collective Communications Library (oneCCL)

2022.0.0 verified Fri May 01 auth: no python

Intel oneAPI Collective Communications Library (oneCCL) provides an efficient implementation of communication patterns used in deep learning. Version 2022.0.0 is distributed as a runtime environment package for Python, typically used with Intel optimizations for distributed training. Release cadence is tied to Intel oneAPI annual releases.

pip install oneccl
error ImportError: No module named 'oneccl_bindings'
cause The 'oneccl' pip package does not include the Python bindings; the bindings need to be installed separately.
fix Install the bindings: conda install -c intel oneccl_bindings_pt (for PyTorch), or use the Intel AI Kit.
error RuntimeError: Distributed package doesn't have NCCL built-in
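cause PyTorch raises this error when backend='nccl' is requested on a build without NCCL, usually because NCCL was confused with CCL. PyTorch does not include the CCL backend by default either; it is registered only when oneccl_bindings is imported.
fix Add 'import oneccl_bindings' before calling init_process_group, and pass backend='ccl' instead of backend='nccl'.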
error ValueError: Invalid backend: 'ccl'
cause oneCCL backend is not registered because oneccl_bindings was not imported.
fix Import the bindings (for example, 'import oneccl_bindings as ccl') before initializing the process group.
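
All three errors above come down to the bindings module being absent or not imported. A minimal pre-flight sketch follows. The helper names find_ccl_bindings and choose_backend are hypothetical, the module names are simply the ones this page mentions (treat them as assumptions for your install), and falling back to PyTorch's 'gloo' backend is an illustrative choice, not something oneCCL prescribes.

```python
import importlib.util

# Module names taken from this page; they are assumptions, not a verified list.
CANDIDATE_MODULES = ("oneccl_bindings", "oneccl_bindings_for_pytorch")


def find_ccl_bindings(candidates=CANDIDATE_MODULES):
    """Return the first candidate module name that can be located, else None.

    find_spec only locates the module; it does not import it, so this
    check does not register the 'ccl' backend by itself.
    """
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def choose_backend(candidates=CANDIDATE_MODULES):
    """Pick 'ccl' when bindings are installed, otherwise fall back to 'gloo'."""
    return "ccl" if find_ccl_bindings(candidates) else "gloo"
```

If find_ccl_bindings returns a name, that module still has to be imported before init_process_group is called, since the import is what registers the backend.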
gotcha The pip package 'oneccl' is a runtime environment (no Python code). You must install 'oneccl_bindings' separately or use the Intel conda channel to get the actual bindings.
fix Install oneccl_bindings via conda: conda install -c intel oneccl_bindings_pt (for PyTorch) or use the full Intel AI Kit.
breaking oneCCL 2022.0.0 changed the package structure. The backend name is unchanged (still 'ccl'), but imports from 'oneccl_bindings_for_pytorch' no longer work.
fix Use 'import oneccl_bindings' and set backend='ccl' in torch.distributed.init_process_group.
deprecated The older 'oneccl_bindings_for_pytorch' package is deprecated; use the unified 'oneccl_bindings'.
fix Replace 'import oneccl_bindings_for_pytorch' with 'import oneccl_bindings'.
gotcha oneCCL is only supported on Linux and requires MPI or a compatible runtime (e.g., Intel MPI). Windows users may encounter DLL errors.
fix Use Linux (preferably with Intel MPI installed). For Windows, consider WSL2.
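
The platform requirement above can be checked before any distributed setup runs. The sketch below is an assumption-laden heuristic: the helper name platform_problems is hypothetical, and looking for an 'mpirun' executable on PATH is only a rough signal that an MPI runtime is installed.

```python
import shutil
import sys


def platform_problems(platform=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means none found."""
    platform = sys.platform if platform is None else platform
    problems = []
    if not platform.startswith("linux"):
        problems.append(f"unsupported platform {platform!r}: oneCCL targets Linux")
    # Looking for 'mpirun' on PATH is only a heuristic for an MPI runtime.
    if which("mpirun") is None:
        problems.append("no 'mpirun' found on PATH; an MPI runtime may be missing")
    return problems
```

Calling platform_problems() with no arguments inspects the current machine; the parameters exist so the check can be exercised without changing the environment.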

Initializes a PyTorch distributed process group using the CCL backend (oneCCL). Requires environment variables RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT to be set.

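import os

import torch.distributed as dist
import oneccl_bindings as ccl  # noqa: F401  (imported for its side effect: registers the 'ccl' backend)

# Initialize the process group with the oneCCL backend.
# init_method='env://' reads MASTER_ADDR and MASTER_PORT from the environment.
dist.init_process_group(
    backend='ccl',
    init_method='env://',
    rank=int(os.environ.get('RANK', 0)),
    world_size=int(os.environ.get('WORLD_SIZE', 1)),
)

if dist.get_rank() == 0:
    print("oneCCL initialized successfully")

# Release communication resources before the process exits.
dist.destroy_process_group()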
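
The snippet above silently defaults RANK to 0 and WORLD_SIZE to 1 when they are unset, which can hide a misconfigured launcher. A minimal sketch of a stricter check, run before init_process_group, might look like this. The helper names missing_env_vars and read_rank_and_world_size are hypothetical; the variable names are the four listed in the description.

```python
import os

REQUIRED_VARS = ("RANK", "WORLD_SIZE", "MASTER_ADDR", "MASTER_PORT")


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty, in a fixed order."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def read_rank_and_world_size(env=None):
    """Validate the environment and return (rank, world_size) as integers."""
    env = os.environ if env is None else env
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    rank, world_size = int(env["RANK"]), int(env["WORLD_SIZE"])
    if not 0 <= rank < world_size:
        raise ValueError(f"RANK={rank} must be in [0, {world_size})")
    return rank, world_size
```

The returned pair can be passed as the rank and world_size arguments in place of the os.environ.get calls, so a missing variable fails early with a clear message.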