Accelerate
Hugging Face library for running PyTorch training across any distributed configuration with minimal code changes. Current version: 1.13.0 (Mar 2026). Requires Python >=3.10. Core pattern: Accelerator() + accelerator.prepare() + accelerator.backward(). accelerate config must be run before first use.
Warnings
- breaking accelerate config must be run before first use. Without a config file, Accelerate silently falls back to single-process CPU mode, so multi-GPU training simply won't use multiple GPUs.
- breaking Python 3.9 support dropped in 1.13.0. Accelerate now requires Python >=3.10.
- breaking Initializing Accelerator() outside the training function raises ValueError when launching multi-GPU training via notebook_launcher. Without notebook_launcher, notebook training silently runs on a single GPU with no error.
- breaking accelerator.load_state() fails with PyTorch 2.6+ because torch.load now defaults to weights_only=True. Optimizer states containing custom objects (omegaconf.ListConfig, etc.) raise UnpicklingError; see the workaround sketch after this list.
- breaking DeepSpeed integration: only one nn.Module per Accelerator instance is supported. Passing multiple models to accelerator.prepare() with DeepSpeed raises AssertionError.
- gotcha accelerate launch consumes any flags it recognizes that appear before the script path. Flags intended for the script must come after the script path (use -- as a separator when a flag name collides with a launcher flag); otherwise they are parsed as accelerate launch flags.
- gotcha loss.backward() instead of accelerator.backward(loss) silently bypasses mixed-precision gradient scaling. Training proceeds, but gradients are wrong under fp16/bf16, leading to numerical instability or NaN loss.
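A workaround for the load_state() failure under PyTorch 2.6+ is to allowlist the offending classes before loading. A minimal sketch, assuming the checkpoint contains omegaconf objects (substitute whatever classes yours actually holds); torch.serialization.add_safe_globals is the PyTorch API for extending the weights_only allowlist, and 'ckpt/' is a hypothetical checkpoint directory:
import torch.serialization
from omegaconf import DictConfig, ListConfig

# Allowlist the custom classes stored in the optimizer state so that
# torch.load(..., weights_only=True) can unpickle them.
torch.serialization.add_safe_globals([DictConfig, ListConfig])
accelerator.load_state('ckpt/')  # hypothetical checkpoint dir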
Install
- pip install accelerate
- accelerate config
- python -c "from accelerate.utils import write_basic_config; write_basic_config(mixed_precision='fp16')"
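To check the active configuration and launch, the usual invocation keeps launcher flags before the script path and script flags after it (train.py and --lr are illustrative):
accelerate env  # print the current environment and config
accelerate launch --num_processes 2 train.py --lr 1e-4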
Imports
- Accelerator
from accelerate import Accelerator

def training_function():
    # Accelerator MUST be initialized inside the training function for notebook_launcher
    accelerator = Accelerator(mixed_precision='fp16')
    model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
    for batch in dataloader:
        optimizer.zero_grad()
        loss = model(batch)
        accelerator.backward(loss)  # NOT loss.backward()
        optimizer.step()
- accelerator.backward
loss = criterion(outputs, targets)
accelerator.backward(loss)
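To actually start multi-GPU training from a notebook, pass the function to notebook_launcher. A minimal sketch; num_processes=2 is an assumption about the available GPUs:
from accelerate import notebook_launcher

# Spawns num_processes workers, each running training_function().
# Accelerator() must be created inside training_function (see Warnings).
notebook_launcher(training_function, args=(), num_processes=2)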
Quickstart
from accelerate import Accelerator
import torch
import torch.nn as nn
def train():
    accelerator = Accelerator(mixed_precision='bf16')
    model = nn.Linear(10, 1)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    dataloader = ...  # your DataLoader
    # prepare() handles device placement and distributed wrapping
    model, optimizer, dataloader = accelerator.prepare(
        model, optimizer, dataloader
    )
    model.train()
    for batch in dataloader:
        optimizer.zero_grad()
        outputs = model(batch['input'])
        loss = nn.functional.mse_loss(outputs, batch['target'])
        accelerator.backward(loss)  # not loss.backward()
        optimizer.step()
    # Save on main process only
    accelerator.wait_for_everyone()
    if accelerator.is_main_process:
        accelerator.save_model(model, 'output/')
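For resumable training, save_state()/load_state() checkpoint the model, optimizer, and RNG state together. A minimal sketch; 'ckpt/' is a hypothetical directory, and the PyTorch 2.6+ warning above applies to load_state():
# Checkpoint everything registered with the Accelerator.
accelerator.save_state('ckpt/')
# Later, after rebuilding and prepare()-ing the same objects:
accelerator.load_state('ckpt/')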