ScriptsApr 2, 2026·2 min read

Stagehand — AI Browser Automation Framework

Three AI primitives — act(), extract(), observe() — to automate any website with natural language. By Browserbase. 21K+ stars.

TO
TokRepo精选 · Community
Quick Use

Use it first, then decide how deep to go

This block should tell both the user and the agent what to copy, install, and apply first.

```bash npm install @browserbasehq/stagehand ``` ```typescript import { Stagehand } from "@browserbasehq/stagehand"; const stagehand = new Stagehand({ env: "LOCAL", // or "BROWSERBASE" for cloud modelName: "gpt-4o", modelClientOptions: { apiKey: process.env.OPENAI_API_KEY }, }); await stagehand.init(); await stagehand.page.goto("https://news.ycombinator.com"); // Extract data with natural language const articles = await stagehand.page.extract({ instruction: "Extract the top 5 article titles and their URLs", schema: z.object({ articles: z.array(z.object({ title: z.string(), url: z.string() })), }), }); // Act on the page await stagehand.page.act({ action: "Click the 'More' link at the bottom" }); // Observe available actions const actions = await stagehand.page.observe({ instruction: "What actions can I take on this page?", }); await stagehand.close(); ``` ---
Intro
Stagehand is an AI browser automation framework by Browserbase with 21,800+ GitHub stars. It reduces browser automation to three simple primitives: `act()` to perform actions, `extract()` to pull structured data, and `observe()` to understand page state — all using natural language instructions. Unlike Selenium or Playwright scripts that break when websites change, Stagehand uses AI vision and DOM understanding to adapt automatically. Built on top of Playwright, it supports both local execution and cloud-scale via Browserbase. Works with: OpenAI GPT-4o, Anthropic Claude, any OpenAI-compatible API, Playwright. Best for developers building web scrapers, testing agents, or browser-based AI workflows. Setup time: under 3 minutes. ---
## Stagehand Core Primitives ### act() — Perform Actions Tell the browser what to do in plain English: ```typescript // Click, type, select, scroll — all with natural language await page.act({ action: "Click the sign-in button" }); await page.act({ action: "Type 'hello world' into the search box" }); await page.act({ action: "Select 'United States' from the country dropdown" }); await page.act({ action: "Scroll down to the pricing section" }); ``` ### extract() — Pull Structured Data Extract data with type-safe Zod schemas: ```typescript import { z } from "zod"; const products = await page.extract({ instruction: "Extract all product names, prices, and ratings", schema: z.object({ products: z.array(z.object({ name: z.string(), price: z.string(), rating: z.number(), })), }), }); // Returns typed data: products.products[0].name ``` ### observe() — Understand the Page Discover what's possible on the current page: ```typescript const observations = await page.observe({ instruction: "What interactive elements are on this page?", }); // Returns: list of actions like "Click 'Add to Cart'", "Open dropdown menu" ``` ### Combining Primitives Build complex workflows by chaining primitives: ```typescript // 1. Navigate and observe await page.goto("https://store.example.com"); const actions = await page.observe({ instruction: "Find the search functionality" }); // 2. Search for a product await page.act({ action: "Search for 'wireless headphones'" }); // 3. Extract results const results = await page.extract({ instruction: "Get the first 5 search results", schema: z.object({ items: z.array(z.object({ name: z.string(), price: z.string(), inStock: z.boolean(), })), }), }); // 4. Take action on best result await page.act({ action: `Click on '${results.items[0].name}'` }); ``` ### Local vs Cloud Execution ```typescript // Local — runs Playwright on your machine const local = new Stagehand({ env: "LOCAL" }); // Cloud — runs on Browserbase infrastructure const cloud = new Stagehand({ env: "BROWSERBASE", apiKey: process.env.BROWSERBASE_API_KEY, projectId: process.env.BROWSERBASE_PROJECT_ID, }); ``` ### Playwright Compatibility Stagehand extends Playwright — use standard selectors when you want deterministic control: ```typescript // Mix AI and traditional selectors await page.act({ action: "Accept the cookie banner" }); // AI handles dynamic UI await page.locator("#product-id-123").click(); // Precise selector const text = await page.extract({ // AI extracts instruction: "Get the product description", schema: z.object({ description: z.string() }), }); ``` --- ## FAQ **Q: What is Stagehand?** A: Stagehand is an AI browser automation framework with 21,800+ GitHub stars that provides three primitives — act(), extract(), observe() — for automating websites using natural language, built on Playwright. **Q: How is Stagehand different from Browser Use or Skyvern?** A: Stagehand is a developer-first TypeScript library with three clean primitives, designed to be embedded in applications. Browser Use is a Python framework focused on autonomous agents. Skyvern adds computer vision for visual automation. Stagehand is the most ergonomic for TypeScript developers. **Q: Is Stagehand free?** A: Yes, fully open-source under MIT license. Local execution is free. Cloud execution via Browserbase has a free tier with paid plans for scale. ---
🙏

Source & Thanks

> Created by [Browserbase](https://github.com/browserbase). Licensed under MIT. > > [stagehand](https://github.com/browserbase/stagehand) — ⭐ 21,800+ Thanks to the Browserbase team for creating the most elegant API for AI browser automation.

Discussion

Sign in to join the discussion.
No comments yet. Be the first to share your thoughts.

Related Assets