Curated Transformers

2.0.1 · active · verified Tue Apr 14

Curated Transformers is a PyTorch library that offers a curated collection of transformer models and components with a consistent API. It supports popular architectures such as MPT, Llama 2, Falcon, BERT, and ELECTRA, along with text generation, 8-bit/4-bit quantization, and Safetensors checkpoints. The latest release is 2.0.1; releases ship as needed for new features and bug fixes, often tracking PyTorch compatibility.
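As a conceptual aside, the 8-bit quantization mentioned above can be illustrated in a few lines of plain Python. The sketch below shows absmax quantization of a small weight vector; it is only an illustration of the idea, not the library's actual quantization implementation.

```python
def absmax_quantize(weights):
    # Scale so the largest magnitude maps to 127 (the int8 range).
    # Assumes at least one nonzero weight; an all-zero vector would
    # need a guard against division by zero.
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    # Recover approximate float weights from int8 values and the scale.
    return [v * scale for v in quantized]

weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = absmax_quantize(weights)
approx = dequantize(q, scale)
```

Each dequantized value differs from the original by at most half a quantization step (`scale / 2`), which is why int8 storage loses little accuracy for well-scaled weights.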

Quickstart

This quickstart demonstrates how to load a causal language model together with its tokenizer from the Hugging Face Hub and generate text from a prompt. Install the library with `pip install curated-transformers`; the `huggingface_hub` and `safetensors` packages must also be installed.

import torch
from curated_transformers.generation import AutoGenerator, GreedyGeneratorConfig

# Example model; replace with your desired model or a local path.
# `AutoGenerator` loads the causal LM and its tokenizer together.
generator = AutoGenerator.from_hf_hub(
    name="explosion/mpt-7b-peft-ct2-int4",
    device=torch.device("cuda", index=0),  # use torch.device("cpu") without a GPU
)

prompts = ["Hello, I'm a language model,"]

# Greedy decoding; the generator returns one string per prompt.
for generated_text in generator(prompts, GreedyGeneratorConfig()):
    print(generated_text)
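One common decoding strategy, greedy decoding, repeatedly appends the highest-scoring next token until an end-of-sequence token is predicted or a length limit is reached. The toy sketch below illustrates that loop using a hypothetical bigram scoring table as a stand-in for a transformer's logits; it is not the library's API.

```python
# Toy next-token "model": maps the last token to scores over a tiny vocabulary.
# A real generator would use transformer logits here.
BIGRAM_SCORES = {
    "Hello": {",": 0.9, "world": 0.1},
    ",": {"I'm": 0.8, "<eos>": 0.2},
    "I'm": {"a": 0.7, "<eos>": 0.3},
    "a": {"model": 0.6, "<eos>": 0.4},
    "model": {"<eos>": 1.0},
}

def greedy_generate(prompt_tokens, max_new_tokens=10, eos="<eos>"):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # Unknown contexts fall back to predicting end-of-sequence.
        scores = BIGRAM_SCORES.get(tokens[-1], {eos: 1.0})
        next_token = max(scores, key=scores.get)  # greedy: argmax over scores
        if next_token == eos:
            break
        tokens.append(next_token)
    return tokens

print(greedy_generate(["Hello"]))  # → ['Hello', ',', "I'm", 'a', 'model']
```

Sampling-based decoders replace the `max` step with a draw from the score distribution, trading determinism for diversity.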
