# Browser Use — AI Agent Browser Automation

> Let AI agents control web browsers with natural language. Browser Use provides vision-based element detection, multi-tab support, and works with any LLM provider.

## Install

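To run Browser Use as an MCP server, merge the JSON from the MCP Server Mode section (under Core Features) into your `.mcp.json`. To use it as a Python library, follow Quick Use.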

## Quick Use

```bash
pip install browser-use
playwright install
```

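```python
import asyncio

from browser_use import Agent
from langchain_anthropic import ChatAnthropic


async def main():
    agent = Agent(
        task="Go to GitHub trending and find the top Python repo",
        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
    )
    # agent.run() is a coroutine, so it has to be awaited inside an async function
    result = await agent.run()
    print(result)


# Start an event loop and run the task to completion
asyncio.run(main())
```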

## What is Browser Use?

Browser Use is a Python library that gives AI agents the ability to control web browsers. It uses vision-based element detection to understand page layout, supports multi-tab browsing, and works with any LangChain-compatible LLM, so agents can complete real web tasks autonomously.

**Answer-Ready**: Browser Use is an AI agent browser automation library that enables LLMs to control web browsers with vision-based element detection, multi-tab support, and natural language task execution. 50k+ GitHub stars.

**Best for**: AI agent developers who need web browsing capabilities. **Works with**: Claude, GPT-4o, Gemini, any LangChain-compatible model. **Setup time**: Under 3 minutes.

## Core Features

### 1. Vision-Based Interaction
Browser Use screenshots the page and identifies interactive elements:

```python
# `llm` is a chat model instance, e.g. the ChatAnthropic object from Quick Use
agent = Agent(
    task="Search for 'AI tools' on Google and click the first result",
    llm=llm,
)
# When run, the agent sees the page, finds the search box, types, and clicks a result
```

### 2. Multi-Tab Support

```python
agent = Agent(
    task="Open three tabs: GitHub, HN, and Reddit. Find the top AI post on each.",
    llm=llm,
)
```

### 3. Custom Actions

```python
from browser_use import Agent, Controller

controller = Controller()

@controller.action("Save data to file")
def save_data(data: str, filename: str):
    with open(filename, 'w') as f:
        f.write(data)

agent = Agent(
    task="Scrape product prices and save to prices.csv",
    llm=llm,
    controller=controller,
)
```
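
To make the decorator pattern above more concrete, here is a minimal sketch of how a description-to-function registry could work. It is not Browser Use's implementation: `ActionRegistry`, its `call` method, and `format_row` are hypothetical names used only for illustration, and the sketch uses only the standard library.

```python
from typing import Callable, Dict


class ActionRegistry:
    """Hypothetical stand-in for a controller: maps descriptions to functions."""

    def __init__(self) -> None:
        self.actions: Dict[str, Callable[..., object]] = {}

    def action(self, description: str) -> Callable:
        # Decorator factory: store the function under its description and
        # return it unchanged so it can still be called directly.
        def register(func: Callable) -> Callable:
            self.actions[description] = func
            return func

        return register

    def call(self, description: str, **kwargs) -> object:
        # Look up an action by description and invoke it with keyword
        # arguments, as an agent loop might after the model picks an action.
        return self.actions[description](**kwargs)


registry = ActionRegistry()


@registry.action("Format a price row")
def format_row(product: str, price: float) -> str:
    return f"{product},{price:.2f}"


row = registry.call("Format a price row", product="widget", price=9.5)
print(row)
```

The description string is what a model would read when choosing an action, which is why it is worth writing it as a clear, specific phrase.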

### 4. Persistent Sessions

```python
from browser_use import BrowserConfig

config = BrowserConfig(
    headless=False,        # Watch it work
    keep_open=True,        # Keep browser open after task
    cookies_file="cookies.json",  # Persist login
)
agent = Agent(task="...", llm=llm, browser_config=config)
```
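
The idea behind a cookies file can be sketched with the standard library alone: cookies are written to JSON after one run and read back before the next. This is an illustration under assumptions, not the library's code. `save_cookies` and `load_cookies` are hypothetical helpers, and the cookie fields shown are example values.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

Cookie = Dict[str, object]


def save_cookies(path: Path, cookies: List[Cookie]) -> None:
    # Serialize the cookie list so a later session can reuse the login
    path.write_text(json.dumps(cookies, indent=2), encoding="utf-8")


def load_cookies(path: Path) -> List[Cookie]:
    # A missing file means a first run: start with no cookies
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    cookie_path = Path(tmp) / "cookies.json"
    first_run = load_cookies(cookie_path)  # nothing saved yet
    save_cookies(
        cookie_path,
        [{"name": "session", "value": "abc123", "domain": "example.com"}],
    )
    restored = load_cookies(cookie_path)  # what a second run would see
```

Because the file holds live session tokens, it should be treated like a credential and kept out of version control.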

### 5. MCP Server Mode

```json
{
  "mcpServers": {
    "browser-use": {
      "command": "uvx",
      "args": ["browser-use-mcp-server"]
    }
  }
}
```

Use Browser Use as an MCP server in Claude Code or other MCP-compatible tools.
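
If `.mcp.json` already lists other servers, the entry has to be merged rather than pasted over the file. A minimal sketch of that merge follows. `merge_mcp_servers` is a hypothetical helper and `other-tool` is a made-up existing entry; only the `browser-use` entry comes from the config above.

```python
import json
from copy import deepcopy


def merge_mcp_servers(existing: dict, addition: dict) -> dict:
    # Work on a copy so the caller's config is left untouched
    merged = deepcopy(existing)
    servers = merged.setdefault("mcpServers", {})
    # Entries in `addition` overwrite entries with the same server name
    servers.update(deepcopy(addition.get("mcpServers", {})))
    return merged


# Hypothetical contents of an existing .mcp.json
existing = {"mcpServers": {"other-tool": {"command": "npx", "args": ["other-tool"]}}}

browser_use_entry = {
    "mcpServers": {
        "browser-use": {"command": "uvx", "args": ["browser-use-mcp-server"]}
    }
}

merged = merge_mcp_servers(existing, browser_use_entry)
print(json.dumps(merged, indent=2))
```

In practice the `existing` dictionary would be loaded from `.mcp.json` with `json.load` and the merged result written back with `json.dump`.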

## Use Cases

| Use Case | Example |
|----------|---------|
| Research | Gather data from multiple websites |
| Testing | E2E test web applications |
| Automation | Fill forms, submit applications |
| Monitoring | Check prices, track changes |

## FAQ

**Q: How does it compare to Playwright MCP?**
A: Playwright MCP provides low-level browser control. Browser Use adds AI vision and autonomous task execution on top of Playwright.

**Q: Does it work with Claude Code?**
A: Yes, via MCP server mode. Install the browser-use-mcp-server package.

**Q: Can it handle login-protected pages?**
A: Yes, with persistent cookies or by letting the agent perform the login flow.

## Source & Thanks

> Created by [Browser Use Team](https://github.com/browser-use). Licensed under MIT.
>
> [browser-use/browser-use](https://github.com/browser-use/browser-use) — 50k+ stars


---
Source: https://tokrepo.com/en/workflows/browser-use-ai-agent-browser-automation-3d04e209
Author: Browser Use