ScriptsApr 2, 2026·2 min read

Browser Use — AI Agent Browser Automation

Make any website accessible to AI agents. Automate browser tasks with LLMs — click, type, navigate, extract data. 70K+ stars, MIT licensed.

TO
TokRepo精选 · Community
Quick Use

Use it first, then decide how deep to go

This block should tell both the user and the agent what to copy, install, and apply first.

```bash pip install browser-use playwright install chromium ``` ```python from browser_use import Agent from langchain_openai import ChatOpenAI agent = Agent( task="Go to reddit.com/r/python, find the top post today, and summarize it", llm=ChatOpenAI(model="gpt-4o"), ) result = await agent.run() print(result) ``` Set your API key: `export OPENAI_API_KEY=sk-...` (or use Anthropic, Gemini, etc.)
## Introduction Browser Use is an **open-source library that makes websites accessible to AI agents**. It bridges the gap between LLMs and real web browsers, enabling agents to autonomously navigate pages, fill forms, click buttons, extract data, and complete multi-step workflows. Core capabilities: - **Vision + HTML Extraction** — Combines visual understanding with DOM analysis for robust element detection, even on complex dynamic pages - **Multi-Tab Management** — Agents can open, switch between, and manage multiple browser tabs simultaneously - **Automatic Error Recovery** — Self-correcting agents that handle popups, CAPTCHAs, and unexpected page states - **Parallel Agents** — Run multiple browser agents concurrently for batch processing tasks - **Custom Actions** — Define reusable browser actions (save to file, send notification, call API) that agents can invoke - **LLM Agnostic** — Works with OpenAI, Anthropic Claude, Google Gemini, DeepSeek, and any LangChain-compatible model - **Session Persistence** — Connect to existing Chrome sessions with cookies and login state preserved 70,000+ GitHub stars. Used for web scraping, form automation, testing, data collection, and building autonomous web agents. ## FAQ **Q: How does Browser Use differ from Playwright or Selenium?** A: Playwright/Selenium require you to write explicit selectors and step-by-step scripts. Browser Use lets you describe tasks in natural language, and the AI agent figures out how to interact with the page autonomously. **Q: Does it work with sites that require login?** A: Yes. You can either let the agent log in with credentials, or connect to an existing Chrome session where you're already authenticated. **Q: Can I run it headless (no visible browser)?** A: Yes. Pass `headless=True` to the Browser config. This is useful for server-side automation and CI/CD pipelines. **Q: How much does it cost to run?** A: Cost depends on the LLM provider. Each page interaction typically uses 1-3K tokens. A typical 10-step task costs ~$0.05-0.15 with GPT-4o. ## Works With - OpenAI / Anthropic / Google / DeepSeek / any LangChain LLM - Playwright (Chromium) for browser control - Python 3.11+ async/await - Docker for containerized deployment
🙏

Source & Thanks

- GitHub: [browser-use/browser-use](https://github.com/browser-use/browser-use) - License: MIT - Stars: 70,000+ - Maintainer: Browser Use team Thanks to the Browser Use team for creating the most popular open-source browser automation framework for AI agents, making the web programmable through natural language.

Discussion

Sign in to join the discussion.
No comments yet. Be the first to share your thoughts.

Related Assets