{"id":23478,"library":"cumm-cu126","title":"cuMM: CUda Matrix Multiply Library","description":"cuMM is a high-performance CUDA matrix multiplication library designed for deep learning and scientific computing. It provides optimized GEMM (General Matrix Multiply) kernels and supports various precision formats. Version 0.8.2 requires Python >=3.8 and is actively maintained.","status":"active","version":"0.8.2","language":"python","source_language":"en","source_url":"https://github.com/FindDefinition/cumm","tags":["cuda","matrix-multiplication","gemm","gpu","deep-learning"],"install":[{"cmd":"pip install cumm-cu126","lang":"bash","label":"PyPI"}],"dependencies":[],"imports":[{"note":"cumm-cu126 is the package name on PyPI, but the import module is 'cumm'.","wrong":"import cumm-cu126","symbol":"cumm","correct":"import cumm"},{"note":"The module name does not include the CUDA version suffix.","wrong":"from cumm_cu126 import functional","symbol":"cumm.functional","correct":"from cumm import functional"}],"quickstart":{"code":"import cumm\nimport torch\nx = torch.randn(128, 128, device='cuda')\ny = torch.randn(128, 128, device='cuda')\nz = cumm.gemm(x, y)\nprint(z.shape)","lang":"python","description":"Basic GEMM operation using cuMM with PyTorch tensors."},"warnings":[{"fix":"Ensure your system has CUDA 12.6 installed and set LD_LIBRARY_PATH appropriately.","message":"cuMM requires a compatible CUDA toolkit (CUDA 12.6) and NVIDIA GPU drivers. Running on an unsupported CUDA version may cause import errors or runtime crashes.","severity":"breaking","affected_versions":"all"},{"fix":"Use 'import cumm' instead of 'import cumm-cu126'.","message":"The library name on PyPI is 'cumm-cu126', but the Python module to import is simply 'cumm'. Do not use the PyPI name in import statements.","severity":"gotcha","affected_versions":"all"},{"fix":"Upgrade to 0.8.2 and replace cumm.gemm_xx with cumm.gemm.","message":"cuMM versions before 0.7.0 used a different API with explicit gemm_ functions. The new API uses cumm.gemm directly.","severity":"deprecated","affected_versions":"<0.7.0"}],"env_vars":null,"last_verified":"2026-05-01T00:00:00.000Z","next_check":"2026-07-30T00:00:00.000Z","problems":[{"fix":"Run 'pip install cumm-cu126' and ensure you use 'import cumm' (no hyphen). Check that CUDA toolkit 12.6 is available.","cause":"Installed package 'cumm-cu126' but Python cannot find the module due to missing dependencies or incorrect import. Also, the module name is exactly 'cumm' (no hyphen).","error":"ModuleNotFoundError: No module named 'cumm'"},{"fix":"Use a supported GPU (e.g., NVIDIA Ampere, Ada Lovelace, Hopper) or rebuild cuMM from source with the appropriate architecture flags.","cause":"The GPU architecture is not supported by the precompiled kernels in cuMM. cuMM ships kernels for specific compute capabilities (e.g., sm_80, sm_86, sm_89, sm_90). Older or newer GPUs may not have a matching kernel.","error":"RuntimeError: CUDA error: no kernel image is available for execution on the device"},{"fix":"Install CUDA 12.6 toolkit and add its lib64 directory to LD_LIBRARY_PATH.","cause":"CUDA runtime library (libcudart.so.12) is not installed or not in the library path.","error":"ImportError: libcudart.so.12: cannot open shared object file: No such file or directory"}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null}