BrowserGym WebArena
Version 0.14.3, verified Mon Apr 27. Auth: no. Language: python.
WebArena benchmark environment for BrowserGym, version 0.14.3. Provides a Gymnasium-compatible environment for evaluating web agents on realistic web interaction tasks.
pip install browsergym-webarena
Common errors
error ModuleNotFoundError: No module named 'browsergym.webarena'
cause The package is not installed, or it is imported under the wrong name.
fix Run 'pip install browsergym-webarena' and import as 'from browsergym.webarena import ...'.
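To tell a missing install apart from a misspelled import, it can help to ask Python whether it can locate the module at all. The sketch below uses only the standard library; the helper name is_importable is an illustrative assumption, not part of browsergym.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate module_name.

    For dotted names, find_spec imports the parent package first, so a
    missing parent surfaces as ModuleNotFoundError (an ImportError).
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    # The dotted name is the import path given above; the underscore
    # variant is the common mistake and is expected to be missing.
    for name in ("browsergym.webarena", "browsergym_webarena"):
        print(name, "->", "found" if is_importable(name) else "missing")
```

If 'browsergym.webarena' is reported as missing, install the package; if only the underscore name is missing, the import statement is the problem.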
error gym.error.UnregisteredEnv: Cannot find environment with id 'browsergym/webarena'
cause The deprecated generic environment ID was used with a version after 0.14.0.
fix Use a task-specific ID, e.g., 'browsergym/webarena.0'.
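A small helper can build task-specific IDs and reject out-of-range values before they reach gym.make. This is a minimal sketch: the function name and the constant are illustrative assumptions, and the 0 to 811 range is the one stated in this document.

```python
WEBARENA_TASK_COUNT = 812  # assumption: task IDs run from 0 to 811, as stated in this document


def webarena_env_id(task_id: int) -> str:
    """Build the task-specific environment ID for an integer task_id."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(task_id, bool) or not isinstance(task_id, int):
        raise TypeError(f"task_id must be an int, got {type(task_id).__name__}")
    if not 0 <= task_id < WEBARENA_TASK_COUNT:
        raise ValueError(f"task_id must be in 0..{WEBARENA_TASK_COUNT - 1}, got {task_id}")
    return f"browsergym/webarena.{task_id}"
```

The returned string can be passed directly as the first argument of gym.make.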
error Error: Docker containers not running. Make sure you have started the WebArena infrastructure.
cause Required Docker services for the benchmark websites are not running.
fix Follow the WebArena setup instructions to start the Docker containers.
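Before creating an environment, a quick preflight check can confirm that the benchmark websites accept TCP connections. The sketch below uses only the standard library; the site name, host, and port in EXAMPLE_SITES are hypothetical placeholders, not values from the WebArena setup, and must be replaced with the addresses of your own deployment.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical placeholder; substitute the hosts and ports of your WebArena sites.
EXAMPLE_SITES = {"example_site": ("localhost", 8080)}


def unreachable_sites(sites: dict, timeout: float = 2.0) -> list:
    """Return the names of sites that do not accept a TCP connection."""
    return [
        name
        for name, (host, port) in sites.items()
        if not is_port_open(host, port, timeout)
    ]
```

An open port only shows that something is listening; it does not prove the site is fully initialized.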
error TypeError: expected string or bytes-like object
cause Agent returned a non-string action (e.g., a dict or array) to the environment.
fix Convert the action to a string before calling 'env.step(action)'.
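One way to guard against this is to normalize whatever the agent produces into a string at a single point. The sketch below assumes a hypothetical dict layout with 'name' and 'args' keys and renders it as a Python-style call string; both the layout and the helper name are illustrative assumptions, so adapt them to the format your agent actually emits.

```python
def to_action_string(action) -> str:
    """Convert an agent output into a string suitable for env.step().

    Assumption: structured actions are dicts such as
    {"name": "click", "args": ["a51"]}. This layout is hypothetical.
    """
    if isinstance(action, str):
        return action
    if isinstance(action, dict):
        name = action["name"]
        args = action.get("args", [])
        # repr() yields Python literals, e.g. 'a51' for strings and True for booleans.
        rendered = ", ".join(repr(arg) for arg in args)
        return f"{name}({rendered})"
    raise TypeError(f"cannot convert {type(action).__name__} to an action string")
```

Calling env.step(to_action_string(action)) then fails early with a clear message instead of raising deep inside the environment.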
Warnings
breaking WebArena requires a specific Docker image for the websites. The environment will fail to initialize if the Docker containers are not running.
fix Pull and run the required Docker images as per the WebArena setup instructions before using the environment.
deprecated The 'browsergym/webarena' environment ID is deprecated in favor of task-specific IDs like 'browsergym/webarena.0'.
fix Use 'browsergym/webarena.<task_id>' where <task_id> is an integer from 0 to 811.
gotcha The environment returns observations as dictionaries with 'screenshot' (PIL Image), 'text' (str), and other fields. Do not assume it returns a single array.
fix Access fields by key, e.g. obs["screenshot"] or obs["text"].
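Because the observation is a dictionary, it is worth inspecting its keys and value types once before writing agent code against it. The helper below is a minimal sketch that makes no assumption about which keys are present; the name summarize_observation is illustrative.

```python
def summarize_observation(obs: dict) -> dict:
    """Map each observation key to a short description of its value."""
    summary = {}
    for key, value in obs.items():
        shape = getattr(value, "shape", None)  # array-like values expose .shape
        if shape is not None:
            summary[key] = f"{type(value).__name__} shape={tuple(shape)}"
        elif isinstance(value, str):
            summary[key] = f"str len={len(value)}"
        else:
            summary[key] = type(value).__name__
    return summary
```

Printing summarize_observation(obs) right after env.reset() shows which fields are available in the installed version.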
gotcha The environment uses Playwright under the hood. Do not run multiple environments in the same process without proper cleanup, or you may face port conflicts.
fix Use 'env.close()' after each episode or use context managers.
gotcha WebArena tasks are defined with a specific evaluation function (teardown). The agent must return an action string; otherwise, the evaluation may not work correctly.
fix Ensure your agent returns a string action from the 'action_space' (Text space) each step.
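The cleanup and string-action points above can be combined in one episode loop. The sketch below assumes the standard Gymnasium reset/step/close interface; run_episode, make_env, and policy are illustrative names, and the environment is passed in as a factory so the loop itself does not depend on any particular package.

```python
from typing import Any, Callable


def run_episode(
    make_env: Callable[[], Any],
    policy: Callable[[Any], str],
    max_steps: int = 10,
) -> float:
    """Run one episode, always closing the environment, and return the total reward."""
    env = make_env()
    total_reward = 0.0
    try:
        obs, info = env.reset()
        for _ in range(max_steps):
            action = policy(obs)
            if not isinstance(action, str):
                raise TypeError(
                    f"policy must return a str action, got {type(action).__name__}"
                )
            obs, reward, terminated, truncated, info = env.step(action)
            total_reward += float(reward)
            if terminated or truncated:
                break
    finally:
        # Runs even if the policy or the environment raises.
        env.close()
    return total_reward
```

With the package installed and the websites running, a call might look like run_episode(lambda: gym.make('browsergym/webarena.0', headless=True), my_policy), where my_policy is your own function that maps an observation to an action string.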
Imports
- WebArenaEnv
wrong from browsergym_webarena import WebArenaEnv
correct from browsergym.webarena import WebArenaEnv
- ALL_WEBARENA_TASKS
wrong from browsergym.webarena.tasks import ALL_WEBARENA_TASKS
correct from browsergym.webarena import ALL_WEBARENA_TASKS
Quickstart
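import gymnasium as gym
import browsergym.webarena  # importing the package registers the WebArena task IDs

# Create the environment for a single task; headless=True runs the browser without a window
env = gym.make('browsergym/webarena.0', headless=True)
obs, info = env.reset()
# run your agent
env.close()  # release the browser when the episode is done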