Browser Use

0.12.6 · active · verified Thu Apr 09

Browser Use is a Python library that lets AI agents navigate and interact with web browsers programmatically. Agents can click buttons, fill in forms, and scrape data, carrying out web-based tasks autonomously. The library is actively maintained and updated frequently.

Quickstart

This quickstart initializes an `Agent` with a natural-language task and an LLM, then runs it to automate a browsing session. It also shows where API keys are read from environment variables and how to pass basic browser configuration.

import asyncio
import os
from dotenv import load_dotenv
from browser_use import Agent, Browser, ChatBrowserUse

load_dotenv()

async def main():
    # API keys are read from environment variables; load_dotenv() above
    # loads them from a .env file if one exists:
    # BROWSER_USE_API_KEY=your_browser_use_key
    # OPENAI_API_KEY=your_openai_key (or other LLM provider)

    # Optionally configure the browser; every argument below is optional
    browser = Browser(
        # use_cloud=False, # Set to True to use Browser Use Cloud
        # headless=False,  # Set to False to see the browser window
        # window_size={'width': 1920, 'height': 1080}
    )

    agent = Agent(
        task="Go to example.com and extract the main heading text",
        llm=ChatBrowserUse(api_key=os.environ.get('BROWSER_USE_API_KEY', '')), # Or ChatOpenAI, ChatGoogle, etc.
        browser=browser,
    )

    result = await agent.run()
    print("Task completed.")
    print(f"Final result: {result.final_result()}")
    print(f"Visited URLs: {result.urls()}")

if __name__ == "__main__":
    asyncio.run(main())
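
The quickstart falls back to an empty string when `BROWSER_USE_API_KEY` is not set, so a missing key may only surface once the agent makes its first LLM call. A minimal sketch of a fail-fast check is shown below. It uses only the standard library; `require_env` is a hypothetical helper name and is not part of Browser Use.

```python
import os
from typing import Iterable, Mapping, Optional


def require_env(names: Iterable[str], env: Optional[Mapping[str, str]] = None) -> dict:
    """Return {name: value} for each required variable.

    Hypothetical helper (not part of Browser Use). Raises RuntimeError
    listing every variable that is missing or blank.
    """
    # Read from os.environ unless a mapping is passed in (useful for tests).
    source = os.environ if env is None else env
    values = {name: source.get(name, "") for name in names}
    missing = [name for name, value in values.items() if not value.strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return values
```

One way to use it is to call `require_env(["BROWSER_USE_API_KEY"])` right after `load_dotenv()` and before constructing the `Agent`, so a misconfigured environment stops the script before a browser is launched.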
