Token Merging for Stable Diffusion (tomesd)

0.1.3 · active · verified Thu Apr 16

tomesd is a Python library that implements Token Merging (ToMe) for Stable Diffusion models. It speeds up inference by merging redundant tokens, which reduces the computational load on the transformer blocks without retraining the model. The library is pure Python on top of PyTorch. The current version is 0.1.3, and it is actively maintained, with releases focused on performance improvements and compatibility.
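To make "merging redundant tokens" concrete, here is a toy, pure-Python sketch of bipartite token merging. It is not tomesd's implementation: the function name `merge_tokens`, the alternating source/destination split, and the plain averaging are simplifying assumptions for illustration only, and the result does not preserve token order.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def merge_tokens(tokens, ratio):
    """Toy bipartite merge (hypothetical helper, not part of tomesd).

    Splits tokens into a source set and a destination set, then folds the
    `ratio` share of tokens that are most similar to a destination token
    into that destination by averaging.
    """
    src = tokens[0::2]  # candidates that may be merged away
    dst = tokens[1::2]  # tokens that are always kept
    r = min(int(len(tokens) * ratio), len(src))  # number of tokens to remove
    if r == 0 or not dst:
        return [list(t) for t in tokens]

    # For every source token, find its most similar destination token.
    best = []
    for i, s in enumerate(src):
        sims = [cosine(s, d) for d in dst]
        j = max(range(len(dst)), key=sims.__getitem__)
        best.append((sims[j], i, j))

    # Merge only the r most redundant source tokens.
    to_merge = sorted(best, reverse=True)[:r]
    merged_idx = {i for _, i, _ in to_merge}

    sums = [list(d) for d in dst]
    counts = [1] * len(dst)
    for _, i, j in to_merge:
        sums[j] = [a + b for a, b in zip(sums[j], src[i])]
        counts[j] += 1

    merged_dst = [[v / c for v in s] for s, c in zip(sums, counts)]
    kept_src = [list(s) for i, s in enumerate(src) if i not in merged_idx]
    return kept_src + merged_dst


tokens = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.0, 1.0]]
print(len(merge_tokens(tokens, 0.25)))  # one duplicate token is folded away
```

With this alternating split, at most half of the tokens can be merged; the sketch only shows the idea that near-duplicate tokens are averaged together so later layers process fewer of them.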

Common errors

Warnings

Install

Imports

Quickstart

This quickstart applies `tomesd` to a `diffusers` Stable Diffusion pipeline: it loads a pre-trained model, patches it with a chosen merging ratio, and generates an image. The `ratio` parameter sets the fraction of tokens that are merged, so it controls the trade-off between speedup and image quality.

import os

import torch
import tomesd
from diffusers import StableDiffusionPipeline

# A Hugging Face token is only needed for gated or private models.
# Read it from the environment instead of hard-coding it; None means "no token".
hf_token = os.environ.get("HF_TOKEN") or None

# 1. Load a Stable Diffusion pipeline (float16 weights on a CUDA GPU)
pipeline = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    # Recent diffusers releases take `token`; older ones used `use_auth_token`
    token=hf_token,
).to("cuda")

# 2. Apply ToMe with a 50% merging ratio.
# `ratio` sets the fraction of tokens merged: a higher ratio gives more
# speedup but can reduce image quality.
tomesd.apply_patch(pipeline, ratio=0.5)

# 3. Generate an image
prompt = "a photo of an astronaut riding a horse on mars"
image = pipeline(prompt).images[0]

# 4. Save the image
image.save("astronaut.png")
print("Image generated and saved as astronaut.png")
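To get a feel for what `ratio` means in token terms, the following back-of-the-envelope sketch counts tokens at the highest-resolution level of the model. It assumes the usual Stable Diffusion v1 setup (an 8x VAE downsampling factor and one token per latent pixel); `token_budget` is a hypothetical helper, not part of tomesd, and the attention figure is a rough upper bound on savings for that layer, not a measured end-to-end speedup.

```python
def token_budget(height, width, ratio, vae_factor=8):
    """Estimate token counts before and after merging (illustrative only).

    Returns (total_tokens, kept_tokens, relative_attention_cost), where the
    last value assumes self-attention cost grows with the square of the
    token count.
    """
    latent_h, latent_w = height // vae_factor, width // vae_factor
    total = latent_h * latent_w   # tokens at the top resolution level
    merged = int(total * ratio)   # tokens folded into others
    kept = total - merged
    attn_cost = (kept / total) ** 2
    return total, kept, attn_cost


total, kept, attn_cost = token_budget(512, 512, ratio=0.5)
print(f"{total} tokens -> {kept} tokens, attention cost x{attn_cost:.2f}")
# prints "4096 tokens -> 2048 tokens, attention cost x0.25"
```

Real speedups are smaller than the attention figure suggests, because the rest of the network and the merge step itself still take time.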
