Gymnasium Reinforcement Learning Library

1.2.3 · active · verified Thu Apr 09

Gymnasium provides a standard API for reinforcement learning environments, along with a diverse set of reference environments for research and development. It is the maintained fork of and successor to OpenAI Gym, developed by the Farama Foundation, and receives frequent minor releases with bug fixes, new features, and API improvements. The current version is 1.2.3.

Warnings

breaking The library is distributed under the package name `gymnasium`, not `gym`. This has been the case since Gymnasium's first release as a fork of Gym v0.26, so code written for OpenAI Gym must update its imports.
Fix: Replace `import gym` with `import gymnasium as gym` throughout your code; the alias keeps existing `gym.` references working.
breaking The `Env.reset()` method now returns a tuple `(observation, info)` instead of just `observation`. The `info` dictionary provides additional diagnostic information.
Fix: Update calls from `obs = env.reset()` to `obs, info = env.reset()` and adjust your code to handle the `info` dictionary.
breaking The `Env.step()` method now returns `(observation, reward, terminated, truncated, info)`. The single boolean `done` has been split into `terminated` (true if the environment reached a terminal state) and `truncated` (true if the episode ended due to a time limit or other external factor).
Fix: Update calls from `obs, reward, done, info = env.step(action)` to `obs, reward, terminated, truncated, info = env.step(action)`. Adjust logic from `if done:` to `if terminated or truncated:`.
breaking The render mode must be chosen when the environment is created, via the `render_mode` argument to `gymnasium.make()`; `env.render()` no longer accepts a mode. If the argument is omitted, `render_mode` defaults to `None` and nothing is rendered.
Fix: Pass `render_mode="rgb_array"` or `render_mode="human"` to `gymnasium.make()` when you need rendering; omit the argument (or pass `render_mode=None`) otherwise.
breaking MuJoCo v2 and v3 environments (e.g., 'Ant-v2', 'Humanoid-v3'), which depend on the legacy `mujoco-py` bindings, have been moved to the `gymnasium-robotics` project. They are no longer part of the core `gymnasium` library.
Fix: Install `gymnasium-robotics` (`pip install gymnasium-robotics`) and update environment IDs if you rely on these specific MuJoCo versions. For modern MuJoCo environments (v4 and later), use `gymnasium[mujoco]`.
gotcha The `gymnasium[box2d]` extra now depends on the `box2d` package, replacing the older `box2d-py`. Installing the extra will handle this automatically, but manual installations might cause issues.
Fix: Install the extra with `pip install 'gymnasium[box2d]'`, or install `box2d` yourself if you manage dependencies manually. Do not rely on `box2d-py` for `gymnasium` versions 1.2.3 and later.
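
The `terminated`/`truncated` split matters most when bootstrapping value targets: a terminated transition has no future return, whereas a truncated one was only cut short. The following is a minimal sketch, assuming a one-step TD target; `td_target`, `episode_over`, `gamma`, and `next_value` are illustrative names and not part of Gymnasium.

```python
def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target that bootstraps unless the state was truly terminal."""
    # Truncation (e.g. a time limit) does not zero the bootstrap term, so
    # `truncated` is deliberately not an argument here.
    return reward + (0.0 if terminated else gamma * next_value)


def episode_over(terminated, truncated):
    """Equivalent of the legacy `done` flag: tells you when to call env.reset()."""
    return terminated or truncated


# A truncated transition still bootstraps from the next state's value ...
truncated_target = td_target(reward=1.0, next_value=10.0, terminated=False)
# ... while a terminated one uses the reward alone.
terminal_target = td_target(reward=1.0, next_value=10.0, terminated=True)
```

Collapsing both flags into a single `done` is fine for deciding when to reset, but using that combined flag to mask the bootstrap term would treat time-limit cutoffs as true terminal states.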

Install

pip install gymnasium                      # core library
pip install 'gymnasium[classic_control]'   # classic control environments (e.g., CartPole)
pip install 'gymnasium[box2d]'             # Box2D environments
pip install 'gymnasium[mujoco]'            # modern MuJoCo environments (v4 and later)
pip install 'gymnasium[all]'               # all environments (installs all official extras)
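
After installing, it can help to confirm which version is actually present in the active environment. This is a small sketch using only the standard library; `installed_version` is a hypothetical helper name, not a Gymnasium function.

```python
from importlib import metadata


def installed_version(package="gymnasium"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(f"gymnasium: {version or 'not installed'}")
```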

Imports

gymnasium.make
```
import gymnasium as gym
env = gym.make('CartPole-v1')
```
Gymnasium is the successor to OpenAI Gym; the package name changed. Use `import gymnasium as gym`.
gymnasium.Env
```
import gymnasium as gym
class MyEnv(gym.Env): ...
```
The core Env class is now directly under the top-level 'gymnasium' package.

gymnasium.spaces
```
import gymnasium as gym
space = gym.spaces.Box(low=0, high=1, shape=(4,))
```
Space definitions are accessed via `gymnasium.spaces`.

Quickstart

Initializes a CartPole-v1 environment, steps through it for 100 timesteps using random actions, and resets when the episode terminates or truncates. Demonstrates the modern `render_mode` parameter, the `seed` argument for `reset()`, and the split `terminated`/`truncated` flags from `step()`.

```
import gymnasium as gym

env = gym.make("CartPole-v1", render_mode="rgb_array")
observation, info = env.reset(seed=42)  # seed is optional, for reproducibility

for _ in range(100):
    # Random action as a placeholder; a real agent would choose it from `observation`.
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        # Reset without re-seeding so each episode starts from a new state.
        observation, info = env.reset()

env.close()
```
