{"id":7795,"library":"tomesd","title":"Token Merging for Stable Diffusion (tomesd)","description":"tomesd is a Python library that implements Token Merging (ToMe) for Stable Diffusion models. It accelerates inference by merging redundant tokens, reducing the computational load on the transformer blocks without requiring model retraining. The library is pure Python, built on PyTorch, and currently at version 0.1.3; its releases have focused on performance improvements and compatibility.","status":"active","version":"0.1.3","language":"en","source_language":"en","source_url":"https://github.com/dbolya/tomesd","tags":["stable diffusion","token merging","diffusion models","optimization","pytorch","diffusers","performance"],"install":[{"cmd":"pip install tomesd","lang":"bash","label":"Install from PyPI"}],"dependencies":[{"reason":"Required for underlying tensor operations and model patching. PyTorch >= 1.12.1 is needed for `scatter_reduce` functionality.","package":"torch","optional":false},{"reason":"Commonly used framework for Stable Diffusion models, which tomesd is designed to optimize. The quickstart uses it for the pipeline.","package":"diffusers","optional":true}],"imports":[{"note":"The primary interaction is through the 'tomesd' module, typically calling `tomesd.apply_patch()` on a model.","wrong":"from tomesd import apply_patch","symbol":"tomesd","correct":"import tomesd"},{"note":"Importing `apply_patch` directly also works, but the documented approach is to import `tomesd` and call `tomesd.apply_patch()`.
This avoids potential naming conflicts and aligns with examples.","wrong":"from tomesd import apply_patch; apply_patch(model, ratio=0.5)","symbol":"apply_patch","correct":"tomesd.apply_patch(model, ratio=0.5)"}],"quickstart":{"code":"import torch\nimport tomesd\nfrom diffusers import StableDiffusionPipeline\n\n# 1. Load a Stable Diffusion pipeline\n# Public checkpoints need no Hugging Face token; for gated or private models,\n# pass `token=...` to from_pretrained.\npipeline = StableDiffusionPipeline.from_pretrained(\n    \"runwayml/stable-diffusion-v1-5\",\n    torch_dtype=torch.float16,\n).to(\"cuda\")\n\n# 2. Apply ToMe with a 50% merging ratio\n# `ratio` controls how many tokens are merged: a higher ratio means more speedup, potentially lower quality.\ntomesd.apply_patch(pipeline, ratio=0.5)\n\n# 3. Generate an image\nprompt = \"a photo of an astronaut riding a horse on mars\"\nimage = pipeline(prompt).images[0]\n\n# 4. Save the image\nimage.save(\"astronaut.png\")\nprint(\"Image generated and saved as astronaut.png\")","lang":"python","description":"This quickstart demonstrates how to apply `tomesd` to a `diffusers` Stable Diffusion pipeline. It loads a pre-trained model, applies the `tomesd` patch with a specified merging ratio, and then generates an image. The `ratio` parameter sets the trade-off between speedup and image quality."},"warnings":[{"fix":"Upgrade to `tomesd` v0.1.3 or later. If you must use an older version and reproducibility is critical, use a separate RNG or set the seed consistently for each batch.","message":"Prior to v0.1.3, `tomesd`'s random perturbations could affect the global torch RNG state, potentially leading to inconsistencies in image generation if not explicitly managed.
[cite: v0.1.3]","severity":"gotcha","affected_versions":"< 0.1.3"},{"fix":"Upgrade to `tomesd` v0.1.2 or later, which automatically disables `use_rand` for odd batch sizes to prevent this issue. Alternatively, ensure an even batch size.","message":"When `use_rand` is enabled, odd batch sizes (where prompted and unprompted images are not in the same batch) could lead to artifacting. [cite: v0.1.2]","severity":"gotcha","affected_versions":"< 0.1.2"},{"fix":"Experiment with different `ratio` values (e.g., 0.3 to 0.6) to find a balance between speedup and acceptable image quality for your specific use case.","message":"`tomesd` is a lossy process, meaning applying it will subtly change the generated image. While designed to minimize quality loss, aggressive merging (higher `ratio`) can degrade image quality.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Benchmark performance with your specific setup and image sizes. `tomesd` generally provides more substantial speedups for larger image resolutions (e.g., 1024x1024 and above) and typically works well in conjunction with other optimizations.","message":"Expected speedups from `tomesd` can vary significantly based on image resolution, batch size, and the underlying Stable Diffusion implementation (e.g., `diffusers` vs. original `runway-ml` repo). Smaller images or pipelines with existing optimizations (like `xformers` or `torch.compile`) might show less dramatic gains.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Ensure `tomesd` is installed correctly in the active Python environment: `pip install tomesd`. 
If using a virtual environment (e.g., `venv`, `conda`), activate it before installing.","cause":"The `tomesd` package is not installed in the Python environment currently being used by your Stable Diffusion application or script.","error":"ModuleNotFoundError: No module named 'tomesd'"},{"fix":"Try reinstalling `tomesd`: `pip uninstall tomesd` followed by `pip install tomesd`. If installing from source, ensure `python setup.py build develop` was run successfully. Double-check the import statement is `import tomesd` before calling `tomesd.apply_patch()`.","cause":"This usually indicates an incomplete or corrupted installation, or an incorrect import. The `tomesd` module was found, but `apply_patch` is missing or not accessible. This could happen if a partial or incorrect `tomesd` directory exists in the Python path (for example, a local folder or file named `tomesd` that shadows the installed package).","error":"Failed to apply ToMe patch, continuing as normal module 'tomesd' has no attribute 'apply_patch'"},{"fix":"Upgrade `tomesd` to the latest version, as compatibility with more resolutions was added in v0.1.1. If the problem persists, try adjusting image dimensions to multiples of 16 or using common Stable Diffusion resolutions like 512x512 or 768x768 for better compatibility. [cite: v0.1.1]","cause":"This error often occurs when `tomesd` is applied to models generating at certain non-standard resolutions that cause tensor dimension mismatches during the token merging process. Issues with specific resolutions (e.g., 1920x1080) were reported early on.","error":"RuntimeError: The size of tensor a (X) must match the size of tensor b (Y) at non-singleton dimension Z"},{"fix":"Verify that `tomesd` is actually being called (e.g., add print statements). Ensure you are testing with larger image resolutions (e.g., 1024x1024) where `tomesd` typically provides more significant gains.
Consider disabling other transformer optimizations temporarily to isolate the effect of `tomesd` and then re-evaluate.","cause":"This can occur on smaller image resolutions, with specific GPU architectures, or when other optimizations (like `xformers`, PyTorch 2.0 scaled dot-product attention (SDPA), or `torch.compile`) already speed up the pipeline substantially. In such scenarios the overhead of token merging might outweigh its benefits.","error":"Lower than expected speedup or even slowdown when using tomesd."}]}