RSL RL Lib
Fast and simple reinforcement learning algorithms (currently PPO) implemented in PyTorch, designed primarily for robotics applications such as Isaac Lab. Current version 5.2.0, released April 2025. Active development with frequent releases.
pip install rsl-rl-lib

Common errors
error ModuleNotFoundError: No module named 'rsl_rl.runner'
cause Module renamed from 'runner' to 'runners' in v5.0.0.
fix Use 'from rsl_rl.runners import OnPolicyRunner'
error AttributeError: module 'rsl_rl' has no attribute 'algorithms'
cause Old import path for the PPO algorithm.
fix Use 'from rsl_rl.algorithms import PPO'
error ValueError: The number of observations and actions do not match
cause Mismatch between the environment's observation/action spaces and the ActorCritic dimensions.
fix Ensure num_actor_obs, num_critic_obs, and num_actions match the environment's observation and action spaces.
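A quick pre-flight check can surface the mismatch before construction fails. This is a minimal sketch with a hypothetical helper (`check_actor_critic_dims` is not part of the library); the shape tuples stand in for `env.observation_space.shape` and `env.action_space.shape`:

```python
def check_actor_critic_dims(obs_shape, action_shape,
                            num_actor_obs, num_critic_obs, num_actions):
    """Hypothetical helper: raise early if the ActorCritic dimensions
    disagree with the environment's space shapes."""
    obs_dim, act_dim = obs_shape[0], action_shape[0]
    problems = []
    if num_actor_obs != obs_dim:
        problems.append(f"num_actor_obs={num_actor_obs} != obs dim {obs_dim}")
    if num_critic_obs != obs_dim:
        problems.append(f"num_critic_obs={num_critic_obs} != obs dim {obs_dim}")
    if num_actions != act_dim:
        problems.append(f"num_actions={num_actions} != action dim {act_dim}")
    if problems:
        raise ValueError("; ".join(problems))

# Dimensions agree with a (3,) observation / (1,) action space: no exception
check_actor_critic_dims((3,), (1,), 3, 3, 1)
```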
Warnings
breaking v5.0.0 restructured the library: modules 'runners' (not 'runner'), 'algorithms' (not 'algos'), 'models' (was 'actor_critic'). Old imports break.
fix Update imports to the new structure: from rsl_rl.runners import OnPolicyRunner; from rsl_rl.algorithms import PPO; from rsl_rl.models import ActorCritic
breaking v5.0.0 introduced a Batch class; the positional argument order in RolloutStorage changed. Incorrect ordering can silently swap tensors.
fix Use named arguments (e.g., obs=, actions=) instead of positional arguments when calling RolloutStorage methods.
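The failure mode is easy to reproduce with a stand-in class (the real RolloutStorage signature may differ; this sketch only illustrates why keywords are safer than positions):

```python
class RolloutStorageSketch:
    """Hypothetical stand-in for RolloutStorage, illustrating the hazard."""

    def __init__(self):
        self.obs, self.actions = [], []

    def add(self, obs, actions):
        # If a caller passes these positionally in a pre-v5.0.0 order,
        # obs and actions are silently swapped -- no error is raised.
        self.obs.append(obs)
        self.actions.append(actions)

storage = RolloutStorageSketch()
# Safe: keyword arguments survive any reordering of the signature
storage.add(obs=[0.1, 0.2], actions=[1.0])
```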
deprecated v5.0.0 deprecates the old configuration format. The new config uses nested dicts instead of flat parameters for algorithm and runner settings.
fix Use a config dict with top-level keys 'algorithm' and 'runner' as shown in the docs.
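A sketch of the nested layout follows. The top-level keys 'algorithm' and 'runner' come from the note above; the inner parameter names are illustrative assumptions, so check the docs for the full schema:

```python
# Nested config layout (inner parameter names are assumptions, not
# the library's documented defaults)
train_cfg = {
    "algorithm": {
        "class_name": "PPO",
        "num_learning_epochs": 5,
        "learning_rate": 1e-3,
    },
    "runner": {
        "num_steps_per_env": 24,
        "max_iterations": 100,
    },
}
```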
gotcha torch.compile with mode 'default' may slow down training for small MLP networks; it is typically only beneficial for large CNNs.
fix Skip torch.compile for simple models: move the model to the device as usual and train it uncompiled.
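One way to apply this rule of thumb is to gate compilation on parameter count. The helper and the 5M-parameter threshold below are assumptions for illustration, not library defaults; profile both settings on your own workload:

```python
def should_compile(num_params, threshold=5_000_000):
    """Heuristic sketch: torch.compile tends to pay off only for large
    models. The threshold is an illustrative assumption -- tune it."""
    return num_params >= threshold

# Small policy MLP (~tens of thousands of params): skip torch.compile
small_mlp_params = 70_000
print(should_compile(small_mlp_params))  # False: train uncompiled
# Usage with a real model (not run here):
#   n = sum(p.numel() for p in model.parameters())
#   model = torch.compile(model) if should_compile(n) else model
```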
Imports
- Runner
  wrong: from rsl_rl.runner import OnPolicyRunner
  correct: from rsl_rl.runners import OnPolicyRunner
- PPO
  wrong: from rsl_rl.algos import PPO
  correct: from rsl_rl.algorithms import PPO
- ActorCritic
  correct: from rsl_rl.models import ActorCritic
Quickstart
import gym
import torch
from rsl_rl.runners import OnPolicyRunner
from rsl_rl.algorithms import PPO
from rsl_rl.models import ActorCritic
# Select device, falling back to CPU when CUDA is unavailable
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Initialize environment (example using gym; Pendulum-v1 has a continuous
# Box action space, which the shape lookups below require -- CartPole's
# Discrete action space has no shape[0])
env = gym.make('Pendulum-v1')
# Setup model and algorithm; dimensions are read from the env spaces
actor_critic = ActorCritic(
    num_actor_obs=env.observation_space.shape[0],
    num_critic_obs=env.observation_space.shape[0],
    num_actions=env.action_space.shape[0],
).to(device)
algo = PPO(actor_critic=actor_critic, num_learning_epochs=5)
# Create runner and train
runner = OnPolicyRunner(env, algo, device=device)
runner.learn(num_learning_iterations=100, init_at_random_ep_len=True)
print('Training complete!')