{"id":6475,"library":"torch-memory-saver","title":"Torch Memory Saver","description":"Torch Memory Saver is a PyTorch library that optimizes GPU memory usage by allowing `torch` tensor memory to be temporarily released and later resumed. It lets developers manage memory more efficiently, especially for large models or for workloads that would otherwise exceed available VRAM. The library is actively developed; its latest stable release, version 0.0.9, was published in October 2025.","status":"active","version":"0.0.9","language":"en","source_language":"en","source_url":"https://github.com/fzyzcjy/torch_memory_saver","tags":["pytorch","memory-management","gpu","cuda","optimization","deep-learning"],"install":[{"cmd":"pip install torch-memory-saver","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Core functionality relies on PyTorch's CUDA tensor and memory management features.","package":"torch","optional":false}],"imports":[{"note":"The package exposes a ready-to-use `torch_memory_saver` singleton.","symbol":"torch_memory_saver","correct":"from torch_memory_saver import torch_memory_saver"},{"note":"Used as a context manager to define a memory region for tensors.","symbol":"region","correct":"torch_memory_saver.region()"},{"note":"Releases GPU memory for tensors created in a defined region.","symbol":"pause","correct":"torch_memory_saver.pause()"},{"note":"Re-allocates GPU memory for previously paused tensors.","symbol":"resume","correct":"torch_memory_saver.resume()"}],"quickstart":{"code":"import torch\nfrom torch_memory_saver import torch_memory_saver\n\nif not torch.cuda.is_available():\n    raise SystemExit(\"CUDA is not available. This library is designed for GPU memory saving.\")\n\nprint(f\"Initial CUDA memory allocated: {torch.cuda.memory_allocated() / (1024**2):.2f} MB\")\n\n# 1. 
Create the tensors that should be pauseable inside a `region` context\nwith torch_memory_saver.region():\n    # Create a large tensor (adjust the size to your GPU memory)\n    pauseable_tensor = torch.full((1_000_000_000,), 100, dtype=torch.uint8, device=\"cuda\")  # ~1 GB\n    print(f\"Tensor created. Current CUDA memory allocated: {torch.cuda.memory_allocated() / (1024**2):.2f} MB\")\n\n# 2. Temporarily pause (release) the GPU memory of tensors created in the region.\n# By default, tensor content is discarded. Use `region(enable_cpu_backup=True)` to preserve it.\ntorch_memory_saver.pause()\nprint(f\"Memory paused. Current CUDA memory allocated: {torch.cuda.memory_allocated() / (1024**2):.2f} MB\")\n\n# At this point, `nvidia-smi` shows reduced GPU memory usage for this process.\n# Other memory-intensive operations can run here.\n\n# 3. After `resume`, CUDA memory is re-occupied for those tensors.\ntorch_memory_saver.resume()\nprint(f\"Memory resumed. Current CUDA memory allocated: {torch.cuda.memory_allocated() / (1024**2):.2f} MB\")\n\n# If `enable_cpu_backup=True` was used, `pauseable_tensor` would still hold its content here.\n# print(f\"Tensor element value after resume (if backed up): {pauseable_tensor[0].item()}\")\n\n# Delete tensors and clear the cache when running multiple experiments in one script\ndel pauseable_tensor\ntorch.cuda.empty_cache()\nprint(f\"Final CUDA memory allocated: {torch.cuda.memory_allocated() / (1024**2):.2f} MB\")","lang":"python","description":"This quickstart demonstrates the core functionality of `torch-memory-saver`: defining a memory region, creating a large tensor within it, and then temporarily releasing and resuming its GPU memory. By default, tensor content is discarded during `pause()` for maximum memory savings; it can be preserved by defining the region with `enable_cpu_backup=True`. 
The example also includes print statements to observe CUDA memory changes."},"warnings":[{"fix":"Use `with torch_memory_saver.region(enable_cpu_backup=True):` to enable CPU-based content backup during pause.","message":"By default, calling `torch_memory_saver.pause()` discards the content of the tensors in the region to maximize memory savings. If you need to preserve the tensor content for later use, you must instantiate the memory region with `with torch_memory_saver.region(enable_cpu_backup=True):`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Test thoroughly when combining `torch-memory-saver` with other low-level CUDA tools. Monitor memory behavior closely.","message":"The library operates by hooking into CUDA's memory allocation (either via `LD_PRELOAD` or PyTorch's custom allocator). This low-level intervention might conflict with other libraries or debugging tools that also modify CUDA memory behavior, potentially leading to unexpected errors or instability.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For CUDA Graph usage, always replace `torch.cuda.graph` with `torch_memory_saver.cuda_graph`.","message":"When utilizing PyTorch's CUDA Graph feature for performance optimization, you must replace `torch.cuda.graph(...)` with `torch_memory_saver.cuda_graph(...)`. This ensures compatibility with the memory saver and allows the release of intermediate tensor memory within the graph, preventing memory accumulation.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Combine this library with standard PyTorch memory optimization techniques for comprehensive memory management.","message":"While `torch-memory-saver` helps manage tensor memory, it does not resolve all general PyTorch memory issues. 
Developers should still follow best practices such as detaching tensors from the computation graph (`.detach()`) when they are not needed for gradients, using `torch.no_grad()` for inference, and explicitly deleting unused objects (`del var; gc.collect(); torch.cuda.empty_cache()`) to prevent other types of memory leaks.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z"}