Axial Positional Embedding


Implementation of Axial Positional Embedding, as described in 'Axial Attention in Multidimensional Transformers' and used in models such as Reformer. This library provides a simple way to add positional encodings to transformer models in an axial (factorized) manner, which sharply reduces the number of positional parameters for long sequences and multi-dimensional data. Current version 0.3.12, supports Python >=3.8.
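
For intuition, a quick back-of-the-envelope comparison (plain arithmetic, not part of the library's API): a full learned positional table stores one vector per position, while the axial factorization stores one vector per row plus one per column.

# Parameter-count sketch for a 64 x 64 grid with dim = 512 (illustrative numbers only)
dim = 512
height, width = 64, 64
full_table = height * width * dim     # 4096 positions * 512 = 2,097,152 parameters
axial_table = (height + width) * dim  # (64 + 64) * 512 = 65,536 parameters
print(full_table, axial_table)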

pip install axial-positional-embedding
error ImportError: cannot import name 'AxialPositionalEmbedding'
cause The import path is wrong; the package installs as axial-positional-embedding (hyphens) but is imported as axial_positional_embedding (underscores).
fix
Use: from axial_positional_embedding import AxialPositionalEmbedding
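A quick sanity check of the import path (package name from PyPI, class name from the project README):

from axial_positional_embedding import AxialPositionalEmbedding  # note underscores, not hyphens
print(AxialPositionalEmbedding)  # prints the class if the import path is correct
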
error RuntimeError: The expanded size of the tensor must match the existing size
cause Input shape mismatch: the embedding is built for a maximum sequence length equal to the product of axial_shape, so broadcasting fails when the input's sequence length or channel dimension does not line up.
fix
Ensure the input has shape (batch, seq_len, dim), with seq_len no larger than the product of axial_shape and dim equal to the embedding dimension.
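A minimal sketch of that shape check, assuming the axial_shape constructor argument from the project README; the names below (x, dim, axial_shape) are illustrative:

import torch

dim = 128
axial_shape = (16, 16)              # maximum sequence length = 16 * 16 = 256
x = torch.randn(2, 200, dim)        # (batch, seq_len, dim)

max_seq_len = axial_shape[0] * axial_shape[1]
assert x.shape[1] <= max_seq_len, "sequence is longer than the embedding supports"
assert x.shape[-1] == dim, "channel dimension must equal dim"
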
error AssertionError: dim must be divisible by 2
cause Some positional-embedding implementations require an even dimension for sin/cos encodings; this library uses learned parameters and does not document such a constraint, but behavior may vary between versions. A related assertion does exist in at least some versions: axial_dims, if passed explicitly, must sum to dim.
fix
If you hit this assertion, make dim even, and check that any axial_dims you pass sum exactly to dim.
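A purely defensive check before constructing the layer (the library itself may not impose this):

dim = 128
assert dim % 2 == 0, "use an even embedding dimension to stay compatible across versions"
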
gotcha Input tensor shape: the layer operates on sequence-shaped input (batch, seq_len, dim), not on channel-first image tensors (batch, channels, height, width). If your data is a 2D feature map in channel-first format, permute to channel-last and flatten the spatial dimensions before applying it, as shown in the sketch below.
fix If using PyTorch's usual image format (NCHW), permute and flatten first: x = x.permute(0, 2, 3, 1).reshape(b, h * w, c)
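A minimal sketch of that conversion for a channel-first feature map (shapes are illustrative):

import torch

b, c, h, w = 2, 128, 16, 16
feature_map = torch.randn(b, c, h, w)   # NCHW, as produced by most conv backbones
x = feature_map.permute(0, 2, 3, 1)     # -> (b, h, w, c), channel-last
x = x.reshape(b, h * w, c)              # -> (b, seq_len, c) with seq_len = h * w
print(x.shape)                          # torch.Size([2, 256, 128])
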
gotcha Dimension mismatch: the dim parameter must match the channel (last) dimension of the input tensor. Common mistake: setting dim=128 when the input has 64 channels.
fix Ensure dim equals the last dimension of the input tensor after any permutation.
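One way to avoid the mismatch is to derive dim from the tensor you plan to add the embedding to (a sketch; constructor arguments follow the project README, shapes are illustrative):

import torch
from axial_positional_embedding import AxialPositionalEmbedding

x = torch.randn(2, 256, 64)   # (batch, seq_len, channels): 64 channels here
pos_emb = AxialPositionalEmbedding(dim = x.shape[-1], axial_shape = (16, 16))
out = pos_emb(x) + x          # dim matches x.shape[-1], so the shapes broadcast cleanly
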
deprecated The library does not formally deprecate features, but updates can be infrequent. For production use, Hugging Face Transformers' Reformer implementation ships its own axial position embeddings and can serve as an alternative.
fix Check the GitHub repo for recent activity; if the project looks stale, evaluate alternatives.

Basic usage: create an axial positional embedding layer with a 2D axial shape and add it to a sequence of token embeddings.

import torch
from axial_positional_embedding import AxialPositionalEmbedding

# Embedding dimension; must match the last dimension of the input tensor.
dim = 128

# The axial shape factorizes the maximum sequence length: 16 * 16 = 256 positions.
height = 16
width = 16

pos_emb = AxialPositionalEmbedding(dim = dim, axial_shape = (height, width))

# Dummy token embeddings of shape (batch, seq_len, dim);
# seq_len may be anything up to height * width.
batch = 2
x = torch.randn(batch, height * width, dim)

# The layer returns the positional embedding (per the project README); add it to the input.
out = pos_emb(x) + x
print(out.shape)  # Expected: (batch, height * width, dim) -> (2, 256, 128)
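
In the snippet above the layer returns the positional embedding rather than a modified input, so the addition to x is explicit; the output shape matches the input and can be fed directly into a transformer block. Call conventions can shift between releases, so double-check the README for the version you have installed.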