SAE Lens

6.43.0 · verified Fri May 01 · auth: no · python

SAE Lens is a library for training, loading, and analyzing sparse autoencoders (SAEs) on transformer language models. Current version is 6.43.0, with frequent releases (multiple versions per month).

pip install sae-lens
error ModuleNotFoundError: No module named 'sae_lens'
cause Package not installed.
fix Run `pip install sae-lens`.
error AssertionError: Expected activation shape (batch, seq_len, d_model) but got ...
cause Activations passed to SAE.encode have incorrect shape or are on wrong device.
fix Pass a 3D tensor of shape (batch, seq_len, d_model), on the same device as the SAE, to sae.encode(act).
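As an illustration, a small validation helper can catch both problems before calling `sae.encode`. This is a sketch, not part of SAE Lens; `check_sae_input` is a hypothetical name:

```python
import torch

def check_sae_input(act: torch.Tensor, d_model: int, device: str) -> torch.Tensor:
    """Normalize activations to (batch, seq_len, d_model) on the expected device.

    Illustrative helper only — not a SAE Lens API.
    """
    if act.ndim == 2:
        # A single sequence of shape (seq_len, d_model): add a batch dimension.
        act = act.unsqueeze(0)
    assert act.ndim == 3 and act.shape[-1] == d_model, (
        f"Expected (batch, seq_len, {d_model}), got {tuple(act.shape)}"
    )
    return act.to(device)

act = torch.randn(5, 768)  # (seq_len, d_model) — missing the batch dimension
act = check_sae_input(act, d_model=768, device="cpu")
print(act.shape)  # torch.Size([1, 5, 768])
```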
error ValueError: Unknown release: ...
cause Provided release name does not exist in the SAE registry.
fix Use a valid release name from sae_lens.known_releases() or check the docs.
breaking In v6.x, the `SAE.from_pretrained` signature changed: the `release` argument is now the first positional argument and required. Old code using `SAE.from_pretrained(sae_id=...)` without `release` will break.
fix Update to: SAE.from_pretrained(release=..., sae_id=...)
gotcha The SAE expects activations on the same device as its own weights. A cross-device setup (e.g., model on GPU, SAE on CPU) can cause silent errors or crashes.
fix Ensure model and SAE are on the same device (use sae.to(device) and model.to(device)).
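A minimal sketch of the fix, using plain nn.Modules as stand-ins for the transformer and the SAE (the layer shapes here are illustrative, not read from any config):

```python
import torch

# Pick one device and move both modules to it.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(768, 768)   # stands in for the transformer
sae = torch.nn.Linear(768, 24576)   # stands in for the SAE
model.to(device)
sae.to(device)

# Activations produced by the model now live on the same device as the
# SAE's weights, so encoding will not hit a cross-device error.
act = torch.randn(1, 5, 768, device=device)
assert act.device.type == next(sae.parameters()).device.type
```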
deprecated The `cache` parameter in `SAE.encode` is deprecated and will be removed in a future version. Use `HookedSAETransformer` or `model.run_with_cache` explicitly.
fix Switch to using `model.run_with_cache` and pass the activations directly to `sae.encode`.

Load a pretrained SAE and compute feature activations for a prompt.

from sae_lens import SAE
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2-small", device="cpu")
sae, cfg_dict, sparsity = SAE.from_pretrained(release="gpt2-small-res-jb", sae_id="blocks.0.hook_resid_pre", device="cpu")

# Example: get SAE feature activations for a prompt
prompt = "Hello, world!"
_, cache = model.run_with_cache(prompt, names_filter=[sae.cfg.hook_name])
act = cache[sae.cfg.hook_name]
sae_acts = sae.encode(act)
print(sae_acts.shape)
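From here, a common next step is to rank which features fire most strongly. A sketch using a stand-in tensor (in practice, use the sae_acts computed above; 24576 is a typical width for these GPT-2 SAEs, not a value read from the config):

```python
import torch

# Stand-in for sae_acts from sae.encode: shape (batch, seq_len, d_sae).
sae_acts = torch.relu(torch.randn(1, 4, 24576))

# Top 5 most active features at the final token position.
final_token_acts = sae_acts[0, -1]
values, indices = final_token_acts.topk(5)
for feat, val in zip(indices.tolist(), values.tolist()):
    print(f"feature {feat}: {val:.3f}")
```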