ProdigyPlusScheduleFree


Automatic learning-rate optimizer combining Prodigy's adaptive step-size estimation with Schedule-Free's parameter interpolation, removing the need for manual LR tuning and LR schedules. Version 2.0.1 fixed weight-decay handling. Under active development.

pip install prodigy-plus-schedule-free
error ImportError: cannot import name 'ProdigyPlusScheduleFree' from 'prodigyplus'
cause Old import path: the class is no longer importable from 'prodigyplus'; the current module is 'prodigy_plus_schedule_free'.
fix
Use: from prodigy_plus_schedule_free import ProdigyPlusScheduleFree
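If the import still fails after switching, confirm which version is actually installed; a minimal stdlib check (the distribution name matches the pip install line above):

import importlib.metadata

# Distribution name as used by pip; expect >= 2.0.1 per the notes below.
print(importlib.metadata.version("prodigy-plus-schedule-free"))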
error RuntimeError: Expected all tensors to be on the same device, but found at least two devices
cause The optimizer was created before the model was moved to its target device, so optimizer state and parameters end up on different devices; device placement is not handled automatically.
fix
Create optimizer after moving model to device: model.to(device); optimizer = ProdigyPlusScheduleFree(model.parameters(), lr=1.0)
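A minimal sketch of the safe ordering, assuming CUDA may or may not be available (the toy model and inputs are illustrative); the key point is constructing the optimizer only after model.to(device):

import torch
from prodigy_plus_schedule_free import ProdigyPlusScheduleFree

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(10, 2).to(device)  # move parameters first
optimizer = ProdigyPlusScheduleFree(model.parameters(), lr=1.0)  # then build the optimizer

# Inputs must live on the same device as the parameters.
x = torch.randn(4, 10, device=device)
y = torch.randint(0, 2, (4,), device=device)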
error TypeError: 'NoneType' object is not callable
cause optimizer.step() was called without first switching the optimizer into training mode via optimizer.train().
fix
Call optimizer.train() once before the training loop and optimizer.zero_grad() before each backward pass; see the mode-switching sketch after the gotcha below.
breaking In v2.0.0, the import path changed from 'prodigyplus_schedulefree' to 'prodigy_plus_schedule_free'. Old imports will break.
fix Update import: 'from prodigy_plus_schedule_free import ProdigyPlusScheduleFree'.
gotcha You must call optimizer.train() at the start of each training phase and optimizer.eval() before evaluation; the schedule-free interpolation yields different effective parameters in each mode.
fix Always switch modes: optimizer.train() before training steps, optimizer.eval() before inference or validation, as in the sketch below.
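A minimal sketch of the mode switching both notes describe, assuming a toy epoch loop with a validation pass (model, batches, and loss are illustrative stand-ins):

import torch
from prodigy_plus_schedule_free import ProdigyPlusScheduleFree

model = torch.nn.Linear(10, 2)
optimizer = ProdigyPlusScheduleFree(model.parameters(), lr=1.0)
loss_fn = torch.nn.functional.cross_entropy
train_batches = [(torch.randn(4, 10), torch.randint(0, 2, (4,)))]
val_batches = [(torch.randn(4, 10), torch.randint(0, 2, (4,)))]

for epoch in range(2):
    optimizer.train()   # training interpolation on before any step()
    model.train()
    for x, y in train_batches:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()

    optimizer.eval()    # eval parameters before measuring anything
    model.eval()
    with torch.no_grad():
        for x, y in val_batches:
            val_loss = loss_fn(model(x), y)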
deprecated Parameter 'weight_decay' had a bug in v1.x where it was applied incorrectly. Use v2.0.1+ for correct weight decay.
fix Upgrade to >=2.0.1. If you cannot upgrade, avoid using weight_decay or implement manually.
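If stuck on v1.x, one rough workaround is to leave weight_decay unset and apply decoupled decay by hand before each step. This is only an approximation: the fixed factor below is illustrative and does not track Prodigy's adapted step size, so treat it as a sketch rather than an equivalent of the fixed built-in decay:

import torch
from prodigy_plus_schedule_free import ProdigyPlusScheduleFree

model = torch.nn.Linear(10, 2)
optimizer = ProdigyPlusScheduleFree(model.parameters(), lr=1.0)  # weight_decay left unset
optimizer.train()

decay = 1e-4  # illustrative value, not Prodigy-adapted
for x, y in [(torch.randn(4, 10), torch.randint(0, 2, (4,)))]:
    optimizer.zero_grad()
    torch.nn.functional.cross_entropy(model(x), y).backward()
    with torch.no_grad():
        for p in model.parameters():
            p.mul_(1.0 - decay)  # manual decoupled weight decay
    optimizer.step()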

Basic usage: instantiate the optimizer, call optimizer.train() before the training loop, step normally, and switch to optimizer.eval() for inference.

import torch
from prodigy_plus_schedule_free import ProdigyPlusScheduleFree

model = torch.nn.Linear(10, 2)
# lr=1.0 is the intended default; Prodigy estimates the step size itself.
optimizer = ProdigyPlusScheduleFree(model.parameters(), lr=1.0)
optimizer.train()  # required before any training steps
for data, target in [(torch.randn(1, 10), torch.tensor([1]))]:  # toy one-batch dataset
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(data), target)
    loss.backward()
    optimizer.step()
optimizer.eval()  # switch back before evaluation or inference