{"id":9059,"library":"k-diffusion","title":"K-Diffusion","description":"K-Diffusion is a PyTorch library implementing the improved diffusion models from Karras et al. (2022). It provides a highly optimized collection of samplers (e.g., DPM-Solver, Euler) and utilities for building and running stable diffusion models. The current version is 0.1.1.post1, and it maintains an active, community-driven release schedule primarily focused on stability and integration with other generative AI projects.","status":"active","version":"0.1.1.post1","language":"en","source_language":"en","source_url":"https://github.com/crowsonkb/k-diffusion","tags":["diffusion models","pytorch","deep learning","generative AI","karras"],"install":[{"cmd":"pip install k-diffusion","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Core deep learning framework","package":"torch","optional":false},{"reason":"Progress bar for sampling","package":"tqdm","optional":false},{"reason":"For loading and saving model weights securely","package":"safetensors","optional":false}],"imports":[{"note":"Early versions might have used `sample_dpmpp_2m_sde_v2` or similar, but direct `sample_dpmpp_2m` is a common and stable choice. Always check the available samplers in `k_diffusion.sampling`.","wrong":"from k_diffusion.sampling import sample_dpmpp_2m_sde_v2","symbol":"sample_dpmpp_2m","correct":"from k_diffusion import sampling\n# ... sampling.sample_dpmpp_2m(...)"},{"note":"`CompVisDenoiser` is part of the `external` module, designed to wrap models from other libraries like CompVis/Stable Diffusion.","wrong":"from k_diffusion.models import CompVisDenoiser","symbol":"CompVisDenoiser","correct":"from k_diffusion import external\n# ... external.CompVisDenoiser(...)"}],"quickstart":{"code":"import torch\nfrom k_diffusion import sampling, external\n\n# 1. 
Define a dummy CompVis-style model (replace with your actual pre-trained UNet)\n# `CompVisDenoiser` reads `alphas_cumprod` from the wrapped model and calls its\n# `apply_model(x, t, **extra_args)` method to obtain the predicted noise.\nclass DummyUNet(torch.nn.Module):\n    def __init__(self, in_channels=4, out_channels=4):\n        super().__init__()\n        self.conv = torch.nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)\n        # A simple linear beta schedule supplies the alphas_cumprod the wrapper needs.\n        betas = torch.linspace(1e-4, 2e-2, 1000)\n        self.register_buffer('alphas_cumprod', torch.cumprod(1.0 - betas, dim=0))\n    def apply_model(self, x, t, cond=None):\n        # A real UNet would condition on the timestep `t` and `cond` here.\n        return self.conv(x)\n\n# Instantiate the dummy UNet\ninner_model = DummyUNet()\n\n# 2. Wrap the UNet with k-diffusion's external denoiser (e.g., for Stable Diffusion latents)\n# This wrapper adapts the model's API to k-diffusion's expected (x, sigma) signature.\nmodel_wrap = external.CompVisDenoiser(inner_model)\nmodel_wrap.eval().cpu() # Set to eval mode and keep on CPU for quickstart simplicity\n\n# 3. Define the sampling schedule and prepare initial noisy latents\nbatch_size = 1\nchannels = 4 # Common for Stable Diffusion latent space\nheight, width = 64, 64 # Latent resolution (e.g., 512x512 image -> 64x64 latent)\nsigmas = sampling.get_sigmas_karras(n=40, sigma_min=0.1, sigma_max=8.0, device='cpu')\ninitial_noise = torch.randn(batch_size, channels, height, width, device='cpu') * sigmas[0]\n\n# 4. Run the sampling process using a DPM++ 2M sampler\n# The sampler takes the wrapped model, initial noise, and the sigma schedule.\nwith torch.no_grad():\n    print(\"Starting K-Diffusion sampling (DPM++ 2M)...\")\n    denoised_latents = sampling.sample_dpmpp_2m(\n        model_wrap,           # The wrapped model callable\n        initial_noise,        # Initial noisy latents\n        sigmas                # Sigma schedule (sigmas[0] == sigma_max)\n        # Optional: `extra_args` can pass conditioning, e.g., {'cond': text_embeddings}\n    )\n    print(f\"Sampling complete. 
Denoised latents shape: {denoised_latents.shape}\")\n    # In a real pipeline, `denoised_latents` would then be decoded to an image.","lang":"python","description":"This quickstart demonstrates how to set up a dummy UNet, wrap it with `k_diffusion.external.CompVisDenoiser` to conform to the library's `(x, sigma)` API, and run a full denoising loop with `sample_dpmpp_2m`. In a real application, the `DummyUNet` would be replaced by your actual pre-trained model (e.g., a Stable Diffusion UNet)."},"warnings":[{"fix":"Always refer to the latest documentation or source code for the exact sampler function names and required arguments. Stick to `k_diffusion` versions 0.1.0+ for better API stability.","message":"Sampler function names and signatures changed in early versions (pre-0.1.0); samplers such as `sample_dpmpp_2m_sde` were added over time, and argument names shifted between releases.","severity":"breaking","affected_versions":"< 0.1.0"},{"fix":"Use the provided `k_diffusion.external` wrappers (e.g., `external.CompVisDenoiser(unet_model)`). If building a custom model, ensure its `forward` method has the signature `forward(self, x, sigma, **extra_args)`, since that is how `k-diffusion` samplers call it.","message":"K-Diffusion expects models to conform to a specific API where the forward pass takes `(x, sigma)`, plus any conditioning passed through `extra_args`. If you're wrapping an external UNet, use the appropriate wrapper (`external.CompVisDenoiser` for eps-prediction models, `external.CompVisVDenoiser` for v-prediction models), or adapt your custom model's forward method.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Carefully check input `x` and `sigma` shapes. The wrapped model must return the denoised sample; k-diffusion's `external` wrappers handle converting eps- or v-predictions to denoised output internally.","message":"Tensor shape and value ranges are crucial. 
`k-diffusion` typically operates on latents (e.g., `(B, C, H, W)`) and expects `sigma` values, not raw `timestep` integers, for its samplers. The output of the wrapped model (denoised output or predicted noise) must also match expectations.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Reduce batch size, use smaller model versions, or offload parts of the model to CPU if supported by a wrapper. Ensure all tensors are consistently on the same device (e.g., `model.to('cuda')`, `x.to('cuda')`).","message":"CUDA out of memory errors are common when using large models or high batch sizes, especially without sufficient GPU memory or when mixing CPU/GPU tensors incorrectly.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Run `pip install k-diffusion` to install the library.","cause":"The k-diffusion library is not installed in your Python environment.","error":"ModuleNotFoundError: No module named 'k_diffusion'"},{"fix":"You likely need to wrap your model using `k_diffusion.external.CompVisDenoiser` or a similar wrapper that adapts the `k-diffusion` API to your model's native `forward` signature.","cause":"Your underlying PyTorch model's `forward` method does not accept `sigma` as an argument directly, but a `k_diffusion` sampler is trying to pass it.","error":"TypeError: forward() got an unexpected keyword argument 'sigma'"},{"fix":"Reduce the `batch_size`, use a smaller image resolution, or offload parts of your model to CPU if the architecture allows (e.g., with specific `diffusers` pipelines). Consider using a GPU with more VRAM.","cause":"Your GPU does not have enough memory to run the model or sampling process with the current settings.","error":"RuntimeError: CUDA out of memory. 
Tried to allocate X GiB (GPU N; X GiB total capacity; Y GiB already allocated; Z GiB free; P MiB reserved in total by PyTorch)"},{"fix":"Check the available functions in the `k_diffusion.sampling` module by using `dir(k_diffusion.sampling)` or consult the official GitHub repository for the correct sampler names for your version.","cause":"The specific sampler function name you are trying to use does not exist or has been renamed in your installed version of `k-diffusion`.","error":"AttributeError: module 'k_diffusion.sampling' has no attribute 'sample_dpmpp_2m_sde_v2'"}]}