PEFT

0.18.1 · active · verified Thu Mar 26

Hugging Face Parameter-Efficient Fine-Tuning library. LoRA, QLoRA, LoHa, IA3, prompt tuning and more. Current version is 0.18.1 (Jan 2026). Requires Python >=3.10. PEFT <0.18.0 is incompatible with Transformers v5.

Warnings

PEFT versions earlier than 0.18.0 are incompatible with Transformers v5. Upgrade PEFT before upgrading Transformers.

Install
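
PEFT installs from PyPI; `bitsandbytes` is an optional extra you only need for QLoRA-style 4-bit training:

```shell
pip install peft
# optional, for QLoRA / k-bit training:
pip install bitsandbytes
```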

Imports
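
The names used in the quickstart below all come from the top-level `peft` namespace:

```python
from peft import (
    LoraConfig,        # LoRA adapter hyperparameters
    get_peft_model,    # wrap a base model with adapters for training
    PeftModel,         # reload a saved adapter onto a base model
    TaskType,          # task enum (CAUSAL_LM, SEQ_CLS, ...)
    prepare_model_for_kbit_training,  # prep a quantized model for training
)
```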

Quickstart

LoRA fine-tuning on all linear layers. Save adapter only — not the full model.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType
import torch

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    'meta-llama/Llama-3.2-1B',
    dtype=torch.bfloat16,  # 'torch_dtype' on Transformers v4
    device_map='auto'
)

# Configure LoRA
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules='all-linear',  # applies to all linear layers (QLoRA style)
    lora_dropout=0.05,
    bias='none',
    task_type=TaskType.CAUSAL_LM
)

model = get_peft_model(model, config)
model.print_trainable_parameters()
# trainable params: 6,815,744 || all params: 1,242,343,424 || trainable%: 0.55

# After training — save adapter only:
model.save_pretrained('lora_adapter/')

# Reload for inference:
from peft import PeftModel
base = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-3.2-1B', dtype=torch.bfloat16)
peft_model = PeftModel.from_pretrained(base, 'lora_adapter/')
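
For deployment without the PEFT runtime, the reloaded adapter can be folded into the base weights with `merge_and_unload()`. A minimal sketch, assuming the `lora_adapter/` directory saved above and the hypothetical output path `merged_model/`:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-3.2-1B', dtype=torch.bfloat16)
peft_model = PeftModel.from_pretrained(base, 'lora_adapter/')

# Fold the LoRA deltas into the base weights; the result is a plain
# transformers model with no adapter dependency.
merged = peft_model.merge_and_unload()
merged.save_pretrained('merged_model/')
```

Merging is lossless for LoRA (the update is additive), but it forfeits the ability to hot-swap adapters at inference time.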

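The intro mentions QLoRA, which the quickstart does not show: quantize the base model to 4-bit, then attach LoRA adapters on top. A sketch assuming `bitsandbytes` is installed; the quantization settings shown are common NF4 defaults, not prescribed by this card:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training, TaskType

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    'meta-llama/Llama-3.2-1B',
    quantization_config=bnb_config,
    device_map='auto',
)
model = prepare_model_for_kbit_training(model)  # casts norms/head, readies grads

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules='all-linear',
    task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(model, config)
```

Only the LoRA parameters train; the 4-bit base stays frozen, which is what keeps QLoRA's memory footprint low.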