{"id":6710,"library":"local-attention","title":"Local Attention","description":"local-attention is a Python library by lucidrains that implements local attention mechanisms with configurable windowing and lookback/lookforward options, primarily for language modeling tasks. It leverages PyTorch for efficient computation and is actively maintained with frequent minor and patch releases.","status":"active","version":"1.11.2","language":"en","source_language":"en","source_url":"https://github.com/lucidrains/local-attention","tags":["attention","deep-learning","transformers","pytorch","language-modeling","causal-attention"],"install":[{"cmd":"pip install local-attention","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Core deep learning framework dependency for tensor operations and model building.","package":"torch","optional":false},{"reason":"Used for flexible and readable tensor manipulations (rearranging, reducing, repeating).","package":"einops","optional":false}],"imports":[{"symbol":"LocalAttention","correct":"from local_attention import LocalAttention"}],"quickstart":{"code":"import torch\nfrom local_attention import LocalAttention\n\n# Ensure reproducibility\ntorch.manual_seed(42)\n\n# Define the local attention layer.\n# window_size sets the size of each local attention window.\n# look_backward=1 lets each window also attend to the one preceding window.\n# look_forward=0 prevents attention to following windows (required when causal=True).\nattn = LocalAttention(\n    window_size = 512,\n    look_backward = 1,\n    look_forward = 0,\n    dropout = 0.,\n    causal = True, # Set to True for autoregressive models\n    exact_windowsize = False\n)\n\n# Create a dummy input tensor: (batch, sequence_length, feature_dimension)\n# For example, a batch of 1 sequence, 1024 tokens long, with 512 features per token.\nx = torch.randn(1, 1024, 512)\n\n# forward() takes separate query, key, and value tensors;\n# here the same tensor is reused for all three for brevity.\ny = attn(x, x, x)\n\nprint(f\"Input shape: {x.shape}\")\nprint(f\"Output shape: {y.shape}\")\n# The 
output shape should be the same as the input shape\nassert x.shape == y.shape\nprint(\"Local attention applied successfully.\")\n","lang":"python","description":"Initializes a `LocalAttention` module with the specified windowing parameters and applies it to a dummy input, demonstrating a causal attention setup suitable for autoregressive models."},"warnings":[{"fix":"Carefully consult the documentation and examples for each parameter. Test with small synthetic inputs to verify the attention mask behavior, especially for autoregressive tasks where strict causality is essential.","message":"Misunderstanding the interplay between `window_size`, `look_backward`, `look_forward`, and `causal` can lead to unintended attention patterns or incorrect information flow. For instance, `causal=True` combined with `look_forward > 0` is rejected by the library with an assertion error, since looking forward would break strict autoregression.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Explicitly move tensors to the target device (e.g., `tensor.to(device)`) and ensure `dtype` consistency, especially when mixing `local-attention` with other PyTorch components or models loaded with different dtypes.","message":"As a PyTorch-based library, ensuring input tensors are on the correct device (CPU/GPU) and have compatible data types (`torch.float32`, `torch.float16`) is crucial. Mismatches frequently cause runtime errors or significantly degraded performance.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Verify your tensors' dimensions before passing them to `LocalAttention`. If necessary, use `tensor.transpose(1, 2)` or `einops.rearrange` to correct the shape.","message":"`LocalAttention.forward` takes separate query, key, and value tensors of shape `(batch, sequence_length, feature_dimension)` (an additional heads dimension before `sequence_length` is also supported). 
Incorrectly shaped inputs, particularly transposing `sequence_length` and `feature_dimension`, are a common source of `RuntimeError` or `ValueError`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Review custom code that interacts with the internal `look_around` logic. For standard usage, this update is a transparent improvement.","message":"Version 1.11.0 updated the internal `look_around()` function to use native PyTorch functionality. While this is not a breaking change to the public API, it is a significant internal optimization. If you relied on the previous internal behavior (e.g., via subclassing or monkey-patching), this change could affect your custom logic.","severity":"gotcha","affected_versions":">=1.11.0"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[]}