# Browser Use — AI Agent Browser Automation

> Make any website accessible to AI agents. Automate browser tasks with LLMs — click, type, navigate, extract data. 70K+ stars, MIT licensed.

## Install

Save as a script file and run:

# Browser Use — AI Agent Browser Automation

## Quick Use

```bash
pip install browser-use
playwright install chromium
```

```python
from browser_use import Agent
from langchain_openai import ChatOpenAI

agent = Agent(
    task="Go to reddit.com/r/python, find the top post today, and summarize it",
    llm=ChatOpenAI(model="gpt-4o"),
)

result = await agent.run()
print(result)
```

Set your API key: `export OPENAI_API_KEY=sk-...` (or use Anthropic, Gemini, etc.)

## Introduction

Browser Use is an **open-source library that makes websites accessible to AI agents**. It bridges the gap between LLMs and real web browsers, enabling agents to autonomously navigate pages, fill forms, click buttons, extract data, and complete multi-step workflows.

Core capabilities:

- **Vision + HTML Extraction** — Combines visual understanding with DOM analysis for robust element detection, even on complex dynamic pages
- **Multi-Tab Management** — Agents can open, switch between, and manage multiple browser tabs simultaneously
- **Automatic Error Recovery** — Self-correcting agents that handle popups, CAPTCHAs, and unexpected page states
- **Parallel Agents** — Run multiple browser agents concurrently for batch processing tasks
- **Custom Actions** — Define reusable browser actions (save to file, send notification, call API) that agents can invoke
- **LLM Agnostic** — Works with OpenAI, Anthropic Claude, Google Gemini, DeepSeek, and any LangChain-compatible model
- **Session Persistence** — Connect to existing Chrome sessions with cookies and login state preserved

70,000+ GitHub stars. Used for web scraping, form automation, testing, data collection, and building autonomous web agents.

## FAQ

**Q: How does Browser Use differ from Playwright or Selenium?**
A: Playwright/Selenium require you to write explicit selectors and step-by-step scripts. Browser Use lets you describe tasks in natural language, and the AI agent figures out how to interact with the page autonomously.

**Q: Does it work with sites that require login?**
A: Yes. You can either let the agent log in with credentials, or connect to an existing Chrome session where you're already authenticated.

**Q: Can I run it headless (no visible browser)?**
A: Yes. Pass `headless=True` to the Browser config. This is useful for server-side automation and CI/CD pipelines.

**Q: How much does it cost to run?**
A: Cost depends on the LLM provider. Each page interaction typically uses 1-3K tokens. A typical 10-step task costs ~$0.05-0.15 with GPT-4o.

## Works With

- OpenAI / Anthropic / Google / DeepSeek / any LangChain LLM
- Playwright (Chromium) for browser control
- Python 3.11+ async/await
- Docker for containerized deployment

## Source & Thanks

- GitHub: [browser-use/browser-use](https://github.com/browser-use/browser-use)
- License: MIT
- Stars: 70,000+
- Maintainer: Browser Use team

Thanks to the Browser Use team for creating the most popular open-source browser automation framework for AI agents, making the web programmable through natural language.


---
Source: https://tokrepo.com/en/workflows/0e7836b7-a471-4867-8c63-60c675b6fe8f
Author: TokRepo精选