Skills2026年3月28日·1 分钟阅读

Browser Use — AI Browser Automation

Open-source Python library for AI-driven browser automation. Works with Claude, GPT, and Gemini to fill forms, scrape data, and navigate websites.

Browser Use · Community

Agent 就绪

先审查再安装

这个资产需要先审查。复制的指令会要求 Agent dry-run、列出写入项，确认后再继续。

Needs Confirmation · 66/100策略：需确认

Agent 入口

任意 MCP/CLI Agent

类型

Skill

安装

Single

信任

信任等级：Community

入口

Browser Use — AI Browser Automation

先审查命令

npx -y tokrepo@latest install 52993269-0cbd-49e2-bb6a-54f429a5feab --target codex

先 dry-run，确认写入项后再运行此命令。

TL;DR

Browser Use lets AI models like Claude and GPT control a browser to automate web tasks.

§01

What it is

Browser Use is an open-source Python library that connects large language models to a real browser. It gives AI agents the ability to navigate websites, fill forms, click buttons, extract data, and perform multi-step web tasks. The library supports Claude, GPT, Gemini, and other LLM providers as the reasoning engine behind the browser actions.

Browser Use targets developers building AI agents that need web interaction capabilities, QA engineers automating browser tests with natural language, and teams building web scraping pipelines that adapt to changing page layouts.

§02

How it saves time or tokens

Traditional browser automation (Selenium, Playwright) requires writing explicit selectors and step-by-step scripts that break when pages change. Browser Use delegates the page understanding to an LLM, which reads the DOM and decides what to click, type, or extract. This makes automations more resilient to layout changes and reduces the maintenance burden of selector-based scripts.

The library handles browser state management, screenshot capture for vision models, and action execution automatically, so you write high-level task descriptions instead of low-level browser commands.

§03

How to use

Install Browser Use: pip install browser-use. Install a browser backend like Playwright: playwright install chromium.
Configure your LLM provider (set API keys for OpenAI, Anthropic, or Google).
Define a task in natural language and run the agent. Browser Use opens a browser, interprets the page, and executes the steps.

§04

Example

from browser_use import Agent
from langchain_openai import ChatOpenAI

async def main():
    agent = Agent(
        task='Go to google.com, search for browser-use github, and return the star count',
        llm=ChatOpenAI(model='gpt-4o'),
    )
    result = await agent.run()
    print(result)

The agent opens a browser, navigates to Google, types the search query, clicks the result, reads the star count, and returns it as structured output.

§05

Related on TokRepo

AI tools for browser automation — Compare browser automation solutions
AI tools for web scraping — Data extraction tools for the web

§06

Common pitfalls

Each browser action requires an LLM call, so token costs add up for complex multi-step tasks. Use cheaper models for simple navigation and reserve expensive models for complex reasoning steps.
Some websites block automated browsers. Use headless mode with caution and respect robots.txt and terms of service.
Vision-based page understanding (sending screenshots to the LLM) uses more tokens than DOM-text-based approaches. Choose the interaction mode based on your cost sensitivity.

常见问题

Which LLM providers does Browser Use support?+

Browser Use works with OpenAI (GPT-4o, GPT-4), Anthropic (Claude), Google (Gemini), and any LangChain-compatible model. The LLM is used as the reasoning engine that decides which browser actions to take based on the current page state.

How does Browser Use differ from Playwright or Selenium?+

Playwright and Selenium require explicit CSS/XPath selectors and scripted step sequences. Browser Use uses an LLM to understand the page and decide actions dynamically. This makes it more resilient to page layout changes but costs API tokens per action.

Can Browser Use handle authentication and login flows?+

Yes. Browser Use can navigate to login pages, fill in credentials, and handle multi-step authentication flows. You can provide credentials in the task description or through environment variables. It also supports cookie-based session persistence.

Is Browser Use suitable for production scraping?+

Browser Use works for production use cases where resilience to page changes matters more than raw speed. For high-volume scraping with stable page layouts, traditional Playwright scripts are faster and cheaper since they do not require LLM calls per action.

Does Browser Use work with local LLMs?+

Yes. Any LangChain-compatible model works, including local models served via Ollama or vLLM. However, browser automation tasks require strong reasoning capabilities, so smaller local models may struggle with complex multi-step navigation.

引用来源 (3)

Browser Use GitHub— Open-source Python library for AI-driven browser automation
Browser Use Documentation— Works with Claude, GPT, and Gemini models
LangChain Documentation— LangChain integration for LLM provider flexibility

🙏

来源与感谢

Created by browser-use. Licensed under MIT. browser-use — ⭐ 84,800+ Docs: docs.browser-use.com

Thanks to the Browser Use team for building the leading open-source AI browser automation library.

讨论

登录后参与讨论。

还没有评论，来写第一条吧。

Browser Use — AI Browser Automation

先审查再安装

What it is

How it saves time or tokens

How to use

Example

Related on TokRepo

Common pitfalls

常见问题

引用来源 (3)

TokRepo 相关

来源与感谢

讨论

相关资产

Browser-Use Web UI — Visual AI Browser Automation

Docker Selenium Grid — Containerized Browser Testing at Scale

Animate.css — Cross-Browser CSS Animation Library

WebLLM — High-Performance In-Browser LLM Inference