Curated Transformers
Curated Transformers is a PyTorch library offering a collection of transformer models and components, optimized for efficiency and integration. It provides a consistent API for popular models such as MPT, Llama 2, Falcon, BERT, and ELECTRA, and supports text generation, 8-bit/4-bit quantization, and loading from safetensors checkpoints. The latest version is 2.0.1; releases occur as needed for features and bug fixes, often with a focus on PyTorch compatibility.
Warnings
- breaking In v2.0.0, the `HFHubRepository` class was renamed to `HFHubModelRepository`. Additionally, `AutoEncoder`, `AutoDecoder`, and `AutoCausalLM` models now require an explicit `repository` object instead of a direct `revision` argument for loading.
- breaking In v2.0.0, the `MPTModelOutput` class was changed to only output `last_hidden_state`. Previous versions also included `past_key_values`.
- gotcha Prior to v2.0.0, `curated-transformers` had strict upper bounds on supported PyTorch versions (e.g., v1.3.1 was limited to PyTorch <2.1.0). Attempting to use incompatible PyTorch versions could lead to runtime errors.
- gotcha Versions `2.0.0`, `1.3.0`, and `1.3.1` of `curated-transformers` contained a bug that caused an activation lookup error when running on Python 3.12.3.
- gotcha In v2.0.0, the parameter `n_vocab` in `curated_transformers.models.bertish.BertEmbedding` and `curated_transformers.models.gpt_neo.GPTNeoEmbedding` was renamed to `n_pieces`.
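The compatibility issues above are easiest to avoid by pinning versions explicitly. An illustrative requirements.txt fragment (the PyTorch bound is an assumption; check the release notes for the versions you use):

```
curated-transformers==2.0.1
# Choose a PyTorch release the pinned version supports (see the gotchas above).
torch>=2.0
safetensors
huggingface_hub
```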
Install
- pip
pip install curated-transformers torch safetensors huggingface_hub
Imports
- AutoCausalLM
from curated_transformers.models import AutoCausalLM
- AutoEncoder
from curated_transformers.models import AutoEncoder
- AutoDecoder
from curated_transformers.models import AutoDecoder
- AutoTokenizer
from curated_transformers.tokenization import AutoTokenizer
- Generator
from curated_transformers.generation import Generator
- AutoGenerator
from curated_transformers.generation import AutoGenerator
Quickstart
import torch
from curated_transformers.generation import AutoGenerator, SampleGeneratorConfig
# Example model; replace with any causal LM architecture that AutoGenerator supports.
# AutoGenerator loads the model and its tokenizer together from the Hugging Face Hub.
generator = AutoGenerator.from_hf_hub(
    name="tiiuae/falcon-7b-instruct",
    device=torch.device("cuda", index=0),
)
# Sample with temperature/top-k; use GreedyGeneratorConfig for deterministic output.
config = SampleGeneratorConfig(temperature=1.0, top_k=2)
[answer] = generator(["What is Python?"], config=config)
print(answer)