RSL RL Lib
Fast and simple reinforcement learning algorithms (currently PPO) implemented in PyTorch, designed primarily for robotics applications such as Isaac Lab. Current version 5.2.0, released April 2025. Active development with frequent releases.
pip install rsl-rl-lib

Common errors
error ModuleNotFoundError: No module named 'rsl_rl.runner'
cause Module renamed from 'runner' to 'runners' in v5.0.0.
fix Use 'from rsl_rl.runners import OnPolicyRunner'
error AttributeError: module 'rsl_rl' has no attribute 'algorithms'
cause Old import path for the PPO algorithm.
fix Use 'from rsl_rl.algorithms import PPO'
error ValueError: The number of observations and actions do not match
cause Mismatch between the environment's observation/action spaces and the ActorCritic dimensions.
fix Ensure num_actor_obs, num_critic_obs, and num_actions match the environment's observation and action spaces.
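A quick pre-flight check can surface the mismatch before construction fails. This is a minimal sketch with a hypothetical helper (`check_actor_critic_dims` is not part of the library); the shape tuples stand in for `env.observation_space.shape` and `env.action_space.shape`:

```python
def check_actor_critic_dims(obs_shape, action_shape,
                            num_actor_obs, num_critic_obs, num_actions):
    """Hypothetical helper: raise early if the ActorCritic dimensions
    disagree with the environment's space shapes."""
    obs_dim, act_dim = obs_shape[0], action_shape[0]
    problems = []
    if num_actor_obs != obs_dim:
        problems.append(f"num_actor_obs={num_actor_obs} != obs dim {obs_dim}")
    if num_critic_obs != obs_dim:
        problems.append(f"num_critic_obs={num_critic_obs} != obs dim {obs_dim}")
    if num_actions != act_dim:
        problems.append(f"num_actions={num_actions} != action dim {act_dim}")
    if problems:
        raise ValueError("; ".join(problems))

# Dimensions agree with a (3,) observation / (1,) action space: no exception
check_actor_critic_dims((3,), (1,), 3, 3, 1)
```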
Warnings
breaking v5.0.0 restructured the library: modules 'runners' (not 'runner'), 'algorithms' (not 'algos'), 'models' (was 'actor_critic'). Old imports break.
fix Update imports to the new structure: from rsl_rl.runners import OnPolicyRunner; from rsl_rl.algorithms import PPO; from rsl_rl.models import ActorCritic
breaking v5.0.0 introduced a Batch class; the positional argument order in RolloutStorage changed. Incorrect ordering can silently swap tensors.
fix Use named arguments (e.g., obs=, actions=) instead of positional arguments when calling RolloutStorage methods.
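The failure mode is easy to reproduce with a stand-in class (the real RolloutStorage signature may differ; this sketch only illustrates why keywords are safer than positions):

```python
class RolloutStorageSketch:
    """Hypothetical stand-in for RolloutStorage, illustrating the hazard."""

    def __init__(self):
        self.obs, self.actions = [], []

    def add(self, obs, actions):
        # If a caller passes these positionally in a pre-v5.0.0 order,
        # obs and actions are silently swapped -- no error is raised.
        self.obs.append(obs)
        self.actions.append(actions)

storage = RolloutStorageSketch()
# Safe: keyword arguments survive any reordering of the signature
storage.add(obs=[0.1, 0.2], actions=[1.0])
```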
deprecated v5.0.0 deprecates the old configuration format. The new config uses nested dicts instead of flat parameters for algorithm and runner settings.
fix Use a config dict with top-level keys 'algorithm' and 'runner' as shown in the docs.
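A sketch of the nested layout follows. The top-level keys 'algorithm' and 'runner' come from the note above; the inner parameter names are illustrative assumptions, so check the docs for the full schema:

```python
# Nested config layout (inner parameter names are assumptions, not
# the library's documented defaults)
train_cfg = {
    "algorithm": {
        "class_name": "PPO",
        "num_learning_epochs": 5,
        "learning_rate": 1e-3,
    },
    "runner": {
        "num_steps_per_env": 24,
        "max_iterations": 100,
    },
}
```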
gotcha torch.compile with mode 'default' may slow down training for small MLP networks; it is typically only beneficial for large CNNs.
fix Skip torch.compile for simple models: move the model to the device as usual and train it uncompiled.
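One way to apply this rule of thumb is to gate compilation on parameter count. The helper and the 5M-parameter threshold below are assumptions for illustration, not library defaults; profile both settings on your own workload:

```python
def should_compile(num_params, threshold=5_000_000):
    """Heuristic sketch: torch.compile tends to pay off only for large
    models. The threshold is an illustrative assumption -- tune it."""
    return num_params >= threshold

# Small policy MLP (~tens of thousands of params): skip torch.compile
small_mlp_params = 70_000
print(should_compile(small_mlp_params))  # False: train uncompiled
# Usage with a real model (not run here):
#   n = sum(p.numel() for p in model.parameters())
#   model = torch.compile(model) if should_compile(n) else model
```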
Imports
- Runner
  wrong: from rsl_rl.runner import OnPolicyRunner
  correct: from rsl_rl.runners import OnPolicyRunner
- PPO
  wrong: from rsl_rl.algos import PPO
  correct: from rsl_rl.algorithms import PPO
- ActorCritic
  correct: from rsl_rl.models import ActorCritic
Quickstart
import gym
import torch
from rsl_rl.runners import OnPolicyRunner
from rsl_rl.algorithms import PPO
from rsl_rl.models import ActorCritic
# Select device, falling back to CPU when CUDA is unavailable
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Initialize environment (example using gym; Pendulum-v1 has a continuous
# Box action space, which the shape lookups below require -- CartPole's
# Discrete action space has no shape[0])
env = gym.make('Pendulum-v1')
# Setup model and algorithm; dimensions are read from the env spaces
actor_critic = ActorCritic(
    num_actor_obs=env.observation_space.shape[0],
    num_critic_obs=env.observation_space.shape[0],
    num_actions=env.action_space.shape[0],
).to(device)
algo = PPO(actor_critic=actor_critic, num_learning_epochs=5)
# Create runner and train
runner = OnPolicyRunner(env, algo, device=device)
runner.learn(num_learning_iterations=100, init_at_random_ep_len=True)
print('Training complete!')