ms-swift


A scalable, lightweight framework for fine-tuning large language models (LLMs), vision-language models (VLMs), and embedding models. It supports LoRA, QLoRA, full-parameter fine-tuning, and reinforcement-learning methods such as DPO, GRPO, and PPO. Version 4.1.3 is the latest release; new versions ship every few weeks.

pip install ms-swift
error ImportError: cannot import name 'SwiftTrainer' from 'swift'
cause The import path changed in v4.0.0.
fix Use `from swift.trainers import SwiftTrainer` instead.
error ModuleNotFoundError: No module named 'swift'
cause The package `ms-swift` is installed but the module is `swift`. Some users try `import ms_swift` or `import msswift`.
fix Install with `pip install ms-swift` and import with `import swift`.
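Because the pip distribution name (`ms-swift`) differs from the import name (`swift`), a small diagnostic helper can catch the mismatch early. A minimal sketch (the helper name `check_swift_install` is hypothetical, not part of ms-swift):

```python
import importlib.util

def check_swift_install() -> str:
    """Diagnose the ms-swift pip-name / import-name mismatch."""
    if importlib.util.find_spec("swift") is not None:
        return "ok: use `import swift`"
    # Module names users commonly guess from the pip name:
    for wrong in ("ms_swift", "msswift"):
        if importlib.util.find_spec(wrong) is not None:
            return f"found '{wrong}', but the module is named 'swift'"
    return "not installed: run `pip install ms-swift`, then `import swift`"
```

Calling this at startup gives an actionable message instead of a bare ModuleNotFoundError.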
error AttributeError: 'LoRAConfig' object has no attribute 'target_modules'
cause Very old versions (pre-1.0) used a different config structure, or you are importing `from swift.tuners import LoRAConfig`, which may be deprecated.
fix Use `from swift import LoRAConfig` and pass `target_modules` as a list of strings.
error RuntimeError: Expected a 'PeftModel' but got 'SwiftModel'
cause ms-swift wraps PEFT models in its own SwiftModel. Some utility functions expect the underlying model.
fix Access the underlying model via `model.model` if needed, or use ms-swift's own training loop.
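One generic way to reach the wrapped model is to follow `.model` attributes until none remain. A hedged sketch (the `unwrap` helper is illustrative, not an ms-swift API; it assumes each wrapper exposes its inner model as `.model`, as described above):

```python
def unwrap(model):
    """Follow `.model` attributes down to the innermost wrapped model.

    Illustrative only: assumes each wrapper layer (e.g. ms-swift's SwiftModel
    or a PEFT wrapper) stores the model it wraps on `.model`.
    """
    while hasattr(model, "model"):
        model = model.model
    return model
```

The unwrapped object can then be handed to utilities that expect the raw model.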
breaking In v4.0.0, the package structure was refactored. SwiftTrainer moved from `swift` to `swift.trainers`, and many utility functions moved to `swift.utils`. Code written for v3.x will break on v4.x without import adjustments.
fix Update imports: `from swift.trainers import SwiftTrainer`, `from swift.utils import get_dataset`. Use `from swift import LoRAConfig` instead of `from swift.tuners`.
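To keep one code base working across both layouts, the import can be attempted in the v4.x location first and the v3.x location as a fallback. A sketch (the `import_swift_trainer` helper is hypothetical, not part of ms-swift):

```python
def import_swift_trainer():
    """Return the SwiftTrainer class across the v3.x -> v4.x refactor."""
    try:
        from swift.trainers import SwiftTrainer  # v4.x layout
        return SwiftTrainer
    except ImportError:
        pass
    try:
        from swift import SwiftTrainer  # v3.x layout
        return SwiftTrainer
    except ImportError as exc:
        raise ImportError(
            "Could not import SwiftTrainer; is ms-swift installed? "
            "(`pip install ms-swift`)"
        ) from exc
```
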
deprecated The `Megatron` integration (Megatron-SWIFT) has been extracted to a separate repository `mcore-bridge` in v4.1.0. Future updates for Megatron training will be there.
fix If you rely on Megatron, use the new repo: https://github.com/modelscope/mcore-bridge
gotcha When using `Swift.prepare_model`, the `target_modules` parameter in LoRAConfig must match actual module names in the model. A common mistake is passing default `target_modules` that do not exist in the model, in which case no LoRA adapters are applied and no error is raised.
fix Check model module names via `model.named_modules()` and set `target_modules` accordingly. For most models, `["q_proj", "v_proj", "k_proj", "o_proj"]` works.
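The membership check can be done with plain string handling on the names from `model.named_modules()`. A sketch (the `resolve_target_modules` helper is illustrative, not part of ms-swift):

```python
def resolve_target_modules(module_names,
                           wanted=("q_proj", "k_proj", "v_proj", "o_proj")):
    """Filter `wanted` down to the suffixes that actually occur in the model.

    `module_names` is an iterable of dotted names, e.g. the names yielded by
    PyTorch's `model.named_modules()`. Raises if nothing matches, so LoRA is
    never silently applied to zero modules.
    """
    present = {name.rsplit(".", 1)[-1] for name in module_names}
    found = [w for w in wanted if w in present]
    if not found:
        raise ValueError(
            f"None of {wanted} exist in this model; LoRA would attach to nothing."
        )
    return found
```

In practice you would feed the names from `model.named_modules()` into this helper and pass the result to `LoRAConfig(target_modules=...)`.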
gotcha ms-swift requires Python >=3.8. However, some features (like GRPO) may need Python >=3.10 due to newer dependency versions. Using Python 3.8 may cause silent failures or missing optimizations.
fix Use Python 3.10 or higher for full functionality.
pip install 'ms-swift[llm]'
pip install 'ms-swift[all]'

Basic LoRA fine-tuning with Qwen2.5. Demonstrates correct imports and usage.

from transformers import AutoModelForCausalLM
from swift import Swift, LoRAConfig

# Load a pre-trained base model with transformers, then attach LoRA adapters.
# Swift.prepare_model returns a SwiftModel that wraps the base model.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B")
lora_config = LoRAConfig(r=8, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = Swift.prepare_model(model, lora_config)

# Quick training snippet (simplified; a real run also needs training
# arguments such as an output directory and a tokenized dataset)
from swift.trainers import Seq2SeqTrainer
from swift.utils import get_dataset

train_dataset = get_dataset("json", data_files="train.jsonl")
trainer = Seq2SeqTrainer(model=model, train_dataset=train_dataset)
trainer.train()