PyTorch LoRA Library (loralib)
loralib is a PyTorch implementation of Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method for large deep learning models. It adapts models with quality comparable to full fine-tuning while training only a small fraction of the parameters, sharply reducing trainable-parameter count and optimizer memory footprint. The library is at version 0.1.2 and is maintained by Microsoft; PyPI releases are infrequent, with most development (including checkpoint releases) happening on GitHub.
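The low-rank idea can be sketched directly in PyTorch: a frozen weight W is augmented by the product B @ A of two thin matrices, so only r*(d_in + d_out) parameters are trained instead of d_in*d_out. A minimal illustration (not loralib internals; dimensions are made up):

```python
import torch

d_in, d_out, r = 1024, 1024, 8

full_params = d_in * d_out        # trainable parameters under full fine-tuning
lora_params = r * (d_in + d_out)  # trainable parameters under LoRA with rank r
print(full_params, lora_params)   # 1048576 vs 16384, a ~64x reduction

W = torch.randn(d_out, d_in)      # frozen pre-trained weight
A = torch.randn(r, d_in)          # trainable "down" projection
B = torch.zeros(d_out, r)         # trainable "up" projection, zero-initialized
                                  # so B @ A starts as a no-op

x = torch.randn(1, d_in)
h = x @ (W + B @ A).t()           # adapted forward pass
```

At initialization the update B @ A is zero, so the adapted model reproduces the base model exactly.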
Common errors
- `AttributeError: 'Linear' object has no attribute 'lora_A'`
  Cause: accessing LoRA-specific parameters (`lora_A`, `lora_B`) on a standard `torch.nn.Linear` layer that was never converted to `loralib.Linear`.
  Fix: replace the `nn.Linear` layer with `loralib.Linear` (or `loralib.Embedding`/`loralib.Conv2d`) and pass `r > 0`; with the default `r=0`, no LoRA parameters are created.
- `RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn`
  Cause: no parameter in the model has `requires_grad=True`, so the backward pass has nothing to differentiate — typically because `loralib.mark_only_lora_as_trainable` was never called, or the wrong layers were converted.
  Fix: after converting the desired layers to `loralib` variants, call `loralib.mark_only_lora_as_trainable(model)` to enable gradients only for the LoRA parameters and freeze the base weights. Verify with `[n for n, p in model.named_parameters() if p.requires_grad]`.
- `RuntimeError: Error(s) in loading state_dict ... Missing key(s)` when loading a state_dict saved with `lora.lora_state_dict(model)`
  Cause: a checkpoint containing only LoRA weights (saved with `lora.lora_state_dict()`) is being loaded into a model that expects all original weights, or vice versa.
  Fix: load the original pre-trained weights first, replace the relevant layers with their `loralib` versions, then load the LoRA checkpoint with `model.load_state_dict(lora_weights, strict=False)`. When saving, decide whether you need the full state_dict or just the LoRA deltas.
Warnings
- gotcha The `loralib` Python package should not be confused with `LoRaLib`, which is an Arduino library for LoRa radio modules. They serve entirely different purposes, and searching broadly for 'LoRa library' can yield irrelevant results.
- gotcha loralib directly supports `nn.Linear`, `nn.Embedding`, and `nn.Conv2d` layers for adaptation. If your model contains other types of layers that you wish to apply LoRA to, you might need to implement custom wrappers or manual adaptation, or consider using other PEFT libraries like Hugging Face's PEFT which may offer broader layer support.
- gotcha When using `loralib`, you still require the original pre-trained model checkpoint to perform inference or further training, as `loralib` only adds low-rank update matrices and does not store the original model weights.
- gotcha `loralib` is not formally deprecated, but Hugging Face's PEFT (Parameter-Efficient Fine-Tuning) library now offers a robust LoRA implementation and is often the recommended route for Hugging Face Transformers models, with broader layer support and model compatibility than the original `loralib` package.
Install
-
pip install loralib
Imports
- loralib as lora
import loralib as lora
- lora.Linear
from loralib import Linear  # drop-in replacement for torch.nn.Linear
- lora.mark_only_lora_as_trainable
import loralib as lora
lora.mark_only_lora_as_trainable(model)
- lora.lora_state_dict
import torch
import loralib as lora
torch.save(lora.lora_state_dict(model), 'lora_weights.pt')
Quickstart
import torch
import torch.nn as nn
import loralib as lora
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(10, 20)
        self.linear2 = nn.Linear(20, 5)

    def forward(self, x):
        return self.linear2(self.linear1(x))
# 1. Instantiate the base model
base_model = MyModel()
# 2. Convert a layer to its LoRA equivalent
# Replace nn.Linear with lora.Linear, specifying rank 'r'.
# Note: this creates a freshly initialized layer; in real use, load the
# pre-trained weights into it afterwards (e.g. via load_state_dict).
base_model.linear1 = lora.Linear(base_model.linear1.in_features, base_model.linear1.out_features, r=4)
# (Optional: Convert more layers)
# base_model.linear2 = lora.Linear(base_model.linear2.in_features, base_model.linear2.out_features, r=4)
# 3. Mark only LoRA parameters as trainable
lora.mark_only_lora_as_trainable(base_model)
# Verify trainable parameters
print("Trainable parameters after LoRA conversion:")
for name, param in base_model.named_parameters():
    if param.requires_grad:
        print(f"  {name}: {param.shape}")
# Example usage (forward pass)
input_tensor = torch.randn(1, 10)
output_tensor = base_model(input_tensor)
print(f"Output shape: {output_tensor.shape}")
# 4. Save only the LoRA-specific state_dict
lora_weights = lora.lora_state_dict(base_model)
# torch.save(lora_weights, 'my_model_lora.pt')