MCT Quantizers

MCT Quantizers is a Python library that provides the quantization infrastructure for neural network compression, covering both quantization-aware training and post-training quantization. It is part of the Model Compression Toolkit (MCT) ecosystem. Version 1.7.0 requires Python >=3.10. The library is actively maintained with regular releases.

pip install mct-quantizers
error ModuleNotFoundError: No module named 'mct_quantizers'
cause The package is not installed, or it is being imported under the wrong name.
fix Install with 'pip install mct-quantizers' and import as 'mct_quantizers' (underscore).
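For example, the package name uses a hyphen on the command line while the module name uses an underscore in Python:

# shell: pip install mct-quantizers
import mct_quantizers  # underscore, not hyphen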
error AttributeError: module 'mct_quantizers' has no attribute 'QuantizationConfig'
cause QuantizationConfig is being imported from the top-level package on a version older than 1.5.0; it only moved out of the mct_quantizers.quantization submodule in version 1.5.0.
fix Use 'from mct_quantizers import QuantizationConfig' (versions >=1.5.0) or 'from mct_quantizers.quantization import QuantizationConfig' (older versions).
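Where code has to run against both old and new releases, a version-tolerant import is a common pattern (a minimal sketch using only the two paths named above):

# Try the post-1.5.0 top-level import first, fall back to the old submodule path
try:
    from mct_quantizers import QuantizationConfig
except ImportError:
    from mct_quantizers.quantization import QuantizationConfig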
error RuntimeError: Quantization not supported for operation type ...
cause The model contains an unsupported layer/operation for quantization.
fix Check the list of supported layers in the documentation. Consider replacing unsupported layers with supported alternatives, or skipping quantization for those layers.
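As a hedged illustration of the replacement approach (nn.GELU here is a purely hypothetical stand-in for whatever op the error names; the real supported list is in the documentation):

import torch.nn as nn

# Hypothetical model whose activation is reported as unsupported by the quantizer
model = nn.Sequential(
    nn.Linear(10, 5),
    nn.GELU(),  # assume this op triggers the error above
)

# Swap the unsupported op for a supported alternative before quantizing
model[1] = nn.ReLU()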
breaking In version 1.5.0, the import path for QuantizationConfig changed from mct_quantizers.quantization to mct_quantizers. Old code will break.
fix Update imports: replace 'from mct_quantizers.quantization import QuantizationConfig' with 'from mct_quantizers import QuantizationConfig'.
deprecated PostTrainingQuantization for TensorFlow models is deprecated since version 1.6.0 and will be removed in future releases. Use PyTorch or ONNX variants instead.
fix If using TensorFlow, migrate your pipeline to PyTorch or use mct-quantizers with the ONNX backend.
gotcha When using multiple GPUs, quantization may fail if the model is not wrapped with DataParallel; this is not handled automatically.
fix Wrap your model with torch.nn.DataParallel before passing it to PostTrainingQuantization, as sketched below.

Basic usage: quantize a PyTorch model with post-training quantization.

import torch
from mct_quantizers import QuantizationConfig
from mct_quantizers.pytorch import PostTrainingQuantization

# Create a simple model
model = torch.nn.Sequential(torch.nn.Linear(10, 5), torch.nn.ReLU())

# Define quantization configuration
config = QuantizationConfig(n_bits=8, per_channel=True)

# Apply post-training quantization
ptq = PostTrainingQuantization(model, config)
quantized_model = ptq.quantize()

# Save the quantized model
torch.save(quantized_model.state_dict(), 'quantized_model.pth')
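
As a quick sanity check (assuming quantize() returns a regular torch.nn.Module, as the save call above implies), the quantized model can be run on a dummy batch:

# Smoke test: the quantized model should still map a (1, 10) input to a (1, 5) output
dummy_input = torch.randn(1, 10)
with torch.no_grad():
    output = quantized_model(dummy_input)
print(output.shape)  # torch.Size([1, 5])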