# Stagehand: AI Browser Automation Framework

> Three AI primitives, `act()`, `extract()`, and `observe()`, to automate any website with natural language. By Browserbase. 21,800+ stars.

## Install

```bash
npm install @browserbasehq/stagehand
```

## Quick Use

Save as a script file and run:

```typescript
import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod"; // needed for the extract() schema below

const stagehand = new Stagehand({
  env: "LOCAL", // or "BROWSERBASE" for cloud
  modelName: "gpt-4o",
  modelClientOptions: { apiKey: process.env.OPENAI_API_KEY },
});

await stagehand.init();
await stagehand.page.goto("https://news.ycombinator.com");

// Extract data with natural language
const articles = await stagehand.page.extract({
  instruction: "Extract the top 5 article titles and their URLs",
  schema: z.object({
    articles: z.array(z.object({ title: z.string(), url: z.string() })),
  }),
});

// Act on the page
await stagehand.page.act({ action: "Click the 'More' link at the bottom" });

// Observe available actions
const actions = await stagehand.page.observe({
  instruction: "What actions can I take on this page?",
});

await stagehand.close();
```

---

## Intro

Stagehand is an AI browser automation framework by Browserbase with 21,800+ GitHub stars. It reduces browser automation to three simple primitives: `act()` to perform actions, `extract()` to pull structured data, and `observe()` to understand page state, all driven by natural language instructions.

Selenium or Playwright scripts built on hard-coded selectors can break when a website's markup changes. Stagehand instead uses AI vision and DOM understanding to adapt to such changes. It is built on top of Playwright and supports both local execution and cloud-scale execution via Browserbase.

Works with: OpenAI GPT-4o, Anthropic Claude, any OpenAI-compatible API, Playwright.

Best for developers building web scrapers, testing agents, or browser-based AI workflows.

Setup time: under 3 minutes.

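
Because Stagehand can run either locally or on Browserbase, it may be convenient to pick the mode from the environment. The following is a minimal sketch, not part of Stagehand itself: `chooseStagehandOptions` and the `StagehandOptions` type are hypothetical names, and the option fields (`env`, `modelName`, `apiKey`, `projectId`) are assumed to match the constructor options shown in this document.

```typescript
// Hypothetical helper (not a Stagehand API): builds constructor options
// from environment variables. Field names mirror this document's snippets.
type StagehandOptions =
  | { env: "LOCAL"; modelName: string }
  | { env: "BROWSERBASE"; modelName: string; apiKey: string; projectId: string };

function chooseStagehandOptions(
  vars: Record<string, string | undefined>,
  modelName = "gpt-4o",
): StagehandOptions {
  const apiKey = vars["BROWSERBASE_API_KEY"];
  const projectId = vars["BROWSERBASE_PROJECT_ID"];
  // Use the cloud only when both Browserbase credentials are present.
  if (apiKey && projectId) {
    return { env: "BROWSERBASE", modelName, apiKey, projectId };
  }
  // Otherwise fall back to a local Playwright browser.
  return { env: "LOCAL", modelName };
}

const localOptions = chooseStagehandOptions({});
const cloudOptions = chooseStagehandOptions({
  BROWSERBASE_API_KEY: "key",
  BROWSERBASE_PROJECT_ID: "project",
});
console.log(localOptions.env, cloudOptions.env); // prints "LOCAL BROWSERBASE"
```

In a real script the result would presumably be passed straight to the constructor, for example `new Stagehand(chooseStagehandOptions(process.env))`. Falling back to `LOCAL` when either credential is missing keeps a partly configured machine from attempting a cloud session.
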

---

## Stagehand Core Primitives

The snippets in this section assume an initialized Stagehand instance with `const page = stagehand.page;`, and `z` imported from `zod`.

### act(): Perform Actions

Tell the browser what to do in plain English:

```typescript
// Click, type, select, scroll: all with natural language
await page.act({ action: "Click the sign-in button" });
await page.act({ action: "Type 'hello world' into the search box" });
await page.act({ action: "Select 'United States' from the country dropdown" });
await page.act({ action: "Scroll down to the pricing section" });
```

### extract(): Pull Structured Data

Extract data with type-safe Zod schemas:

```typescript
import { z } from "zod";

const products = await page.extract({
  instruction: "Extract all product names, prices, and ratings",
  schema: z.object({
    products: z.array(z.object({
      name: z.string(),
      price: z.string(),
      rating: z.number(),
    })),
  }),
});
// Returns typed data: products.products[0].name
```

### observe(): Understand the Page

Discover what's possible on the current page:

```typescript
const observations = await page.observe({
  instruction: "What interactive elements are on this page?",
});
// Returns: list of actions like "Click 'Add to Cart'", "Open dropdown menu"
```

### Combining Primitives

Build complex workflows by chaining primitives:

```typescript
// 1. Navigate and observe
await page.goto("https://store.example.com");
const actions = await page.observe({ instruction: "Find the search functionality" });

// 2. Search for a product
await page.act({ action: "Search for 'wireless headphones'" });

// 3. Extract results
const results = await page.extract({
  instruction: "Get the first 5 search results",
  schema: z.object({
    items: z.array(z.object({
      name: z.string(),
      price: z.string(),
      inStock: z.boolean(),
    })),
  }),
});

// 4. Take action on best result
await page.act({ action: `Click on '${results.items[0].name}'` });
```

### Local vs Cloud Execution

```typescript
// Local: runs Playwright on your machine
const local = new Stagehand({ env: "LOCAL" });

// Cloud: runs on Browserbase infrastructure
const cloud = new Stagehand({
  env: "BROWSERBASE",
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,
});
```

### Playwright Compatibility

Stagehand extends Playwright, so you can use standard selectors when you want deterministic control:

```typescript
// Mix AI and traditional selectors
await page.act({ action: "Accept the cookie banner" }); // AI handles dynamic UI
await page.locator("#product-id-123").click();          // Precise selector
const text = await page.extract({                       // AI extracts
  instruction: "Get the product description",
  schema: z.object({ description: z.string() }),
});
```

---

## FAQ

**Q: What is Stagehand?**
A: Stagehand is an AI browser automation framework with 21,800+ GitHub stars. It is built on Playwright and provides three primitives, `act()`, `extract()`, and `observe()`, for automating websites using natural language.

**Q: How is Stagehand different from Browser Use or Skyvern?**
A: Stagehand is a developer-first TypeScript library with three primitives, designed to be embedded in applications. Browser Use is a Python framework focused on autonomous agents. Skyvern adds computer vision for visual automation. Of the three, Stagehand is the one aimed at TypeScript developers.

**Q: Is Stagehand free?**
A: Yes. It is fully open source under the MIT license, and local execution is free. Cloud execution via Browserbase has a free tier, with paid plans for scale.

---

## Source & Thanks

> Created by [Browserbase](https://github.com/browserbase). Licensed under MIT.
>
> [stagehand](https://github.com/browserbase/stagehand) - ⭐ 21,800+

Thanks to the Browserbase team for creating an elegant API for AI browser automation.


---

Source: https://tokrepo.com/en/workflows/5114a013-a144-4020-8611-c38b74968b99

Author: TokRepo精选