Gymnasium Reinforcement Learning Library
Gymnasium provides a standard API for reinforcement learning environments, offering a diverse set of reference environments for research and development. It is the spiritual successor to OpenAI Gym, maintained by the Farama Foundation, and receives frequent minor releases with bug fixes, new features, and API improvements. The current version is 1.2.3.
Warnings
- breaking The package is named `gymnasium`, not `gym`. Gymnasium has used this name since its first release (v0.26, the fork of OpenAI Gym), so code written against `gym` must update its imports, typically to `import gymnasium as gym`.
- breaking The `Env.reset()` method now returns a tuple `(observation, info)` instead of just `observation`. The `info` dictionary provides additional diagnostic information.
- breaking The `Env.step()` method now returns `(observation, reward, terminated, truncated, info)`. The single boolean `done` has been split into `terminated` (true if the environment reached a terminal state) and `truncated` (true if the episode ended due to a time limit or other external factor).
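- breaking The render mode must be chosen when the environment is created, via the `render_mode` argument of `gymnasium.make()` (for example `"human"` or `"rgb_array"`); `env.render()` no longer takes a mode argument. If rendering is not needed, leave `render_mode` at its default of `None`.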
- breaking MuJoCo v2 and v3 environments (e.g., 'Ant-v2', 'Humanoid-v3') have been moved to the `gymnasium-robotics` project. They are no longer part of the core `gymnasium` library.
- gotcha The `gymnasium[box2d]` extra now depends on the `box2d` package, replacing the older `box2d-py`. Installing the extra will handle this automatically, but manual installations might cause issues.
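The split of `done` into `terminated` and `truncated` matters most in learning code that bootstraps from the next state's value. The following is a minimal sketch of that distinction, not part of the Gymnasium API: the helper names `legacy_done` and `td_target` and the default `gamma` are assumptions chosen for illustration.

```python
def legacy_done(terminated: bool, truncated: bool) -> bool:
    """Collapse the two flags into the single `done` flag that older code expects."""
    return terminated or truncated


def td_target(reward: float, next_value: float, terminated: bool, gamma: float = 0.99) -> float:
    """One-step TD target (hypothetical helper).

    Bootstrapping stops only when the next state is terminal. Truncation is
    deliberately not an argument: a time-limit cutoff does not make the next
    state terminal, so its value estimate is still used.
    """
    return reward + (0.0 if terminated else gamma * next_value)
```

Code that only needs to know when to call `reset()` can keep using the combined flag, while value targets should look at `terminated` alone.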
Install
- pip install gymnasium
- pip install 'gymnasium[classic_control]'
- pip install 'gymnasium[box2d]'
- pip install 'gymnasium[mujoco]'
- pip install 'gymnasium[all]' # installs all official extras
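After installing, it can be useful to confirm which distribution is actually present, since the legacy `gym` package may still be installed alongside `gymnasium`. The check below is a sketch using only the standard library; `installed_version` is a hypothetical helper name, not something Gymnasium provides.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report both the current and the legacy distribution names.
for name in ("gymnasium", "gym"):
    print(name, installed_version(name) or "not installed")
```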
Imports
- gymnasium.make
    import gymnasium as gym
    env = gym.make('CartPole-v1')
- gymnasium.Env
    import gymnasium as gym
    class MyEnv(gym.Env): ...
- gymnasium.spaces
    import gymnasium as gym
    space = gym.spaces.Box(low=0, high=1, shape=(4,))
Quickstart
import gymnasium as gym

env = gym.make("CartPole-v1", render_mode="rgb_array")
observation, info = env.reset(seed=42)  # seed is optional, for reproducibility
for _ in range(100):
    action = env.action_space.sample()  # random action; replace with your agent's policy
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        # Start a new episode; omit the seed so later episodes are not identical
        observation, info = env.reset()
env.close()
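The Quickstart loop can also be wrapped in a reusable function that runs a single episode and reports how it ended. This is a sketch under stated assumptions: `run_episode`, `policy`, and `max_steps` are hypothetical names, and the function relies only on the `reset()` and `step()` return shapes, so it works with any object that follows the Gymnasium API.

```python
from typing import Any, Callable, Optional, Tuple


def run_episode(
    env: Any,
    policy: Callable[[Any], Any],
    seed: Optional[int] = None,
    max_steps: int = 1000,
) -> Tuple[float, int, bool, bool]:
    """Run one episode and return (total_reward, steps, terminated, truncated)."""
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    terminated = truncated = False
    # Stop when the environment signals the end, or after max_steps as a safety cap.
    while not (terminated or truncated) and steps < max_steps:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
    return total_reward, steps, terminated, truncated
```

With a real environment, a random policy can be passed as `lambda obs: env.action_space.sample()`, for example `run_episode(env, lambda obs: env.action_space.sample(), seed=42)`.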