# Skyvern: AI Visual Browser Automation Agent

> Automate any website using LLMs and computer vision. No selectors needed; works on sites never seen before. 21K+ stars.

## Install

Save the content below to `.claude/skills/` or append to your `CLAUDE.md`:

# Skyvern: AI Visual Browser Automation Agent

## Quick Use

```bash
pip install skyvern
skyvern quickstart
```

This launches the Skyvern UI at `http://localhost:8080`.

Or use the SDK:

```python
from skyvern import Skyvern

skyvern = Skyvern(api_key="YOUR_KEY")

# Navigate and interact with any website
task = skyvern.create_task(
    url="https://store.example.com",
    goal="Add the cheapest laptop to cart and proceed to checkout",
)

# Fetch the task's current status and any extracted data
result = skyvern.get_task(task.task_id)
print(result.status, result.extracted_data)
```

Playwright SDK (new):

```python
import asyncio

from skyvern import SkyvernPlaywright


async def main():
    # `async with` must run inside a coroutine, hence the wrapper
    async with SkyvernPlaywright() as skyvern:
        page = await skyvern.new_page()
        await page.goto("https://example.com")
        await page.act("Click on the login button")
        data = await page.extract("Get the user profile information")
        print(data)


asyncio.run(main())
```

---

## Intro

Skyvern is an AI browser automation platform with 21,000+ GitHub stars that combines LLMs with computer vision to automate any website, including ones it has never seen before. Unlike traditional automation tools that rely on brittle CSS selectors and XPaths, Skyvern visually understands web pages and plans actions like a human would. It offers a Python/TypeScript SDK (as a Playwright extension), a no-code workflow builder UI, and a managed cloud with anti-bot handling and proxy rotation. It is used in production by companies automating procurement, data entry, and web research at scale.

Works with: Any website, OpenAI GPT-4o, Anthropic Claude, Playwright.

Best for teams automating complex web workflows that break with traditional selectors.

Setup time: under 5 minutes.
---

## Skyvern Architecture & Capabilities

### Three Interaction Modes

| Mode | How It Works | Best For |
|------|-------------|----------|
| **AI Mode** | Pure natural language: LLM + vision decide what to do | Unknown/dynamic websites |
| **Selector Mode** | Traditional Playwright CSS/XPath selectors | Known, stable pages |
| **AI-Fallback** | Tries the selector first, falls back to AI if it fails | Production reliability |

### Core AI Commands (Playwright SDK)

```python
# These calls assume `page` comes from the Playwright SDK example in Quick Use
# and that they run inside an async function.

# Act: perform an action described in natural language
await page.act("Fill in the email field with test@example.com")
await page.act("Click the submit button")
await page.act("Select 'Express' from the shipping dropdown")

# Extract: pull structured data from the page
pricing = await page.extract("Get all pricing plans with features")

# Validate: check if a condition is met
is_logged_in = await page.validate("Is the user currently logged in?")

# Prompt: ask the LLM a question about the page
answer = await page.prompt("What payment methods does this site accept?")
```

### Visual Understanding

Skyvern uses computer vision to:

- Identify interactive elements (buttons, forms, dropdowns) by appearance
- Read and understand page layout without DOM access
- Handle CAPTCHAs and visual challenges
- Adapt to layout changes automatically

### No-Code Workflow Builder

The UI at `localhost:8080` provides:

```
┌─────────────────────────────────────┐
│ Workflow: Auto-fill Job Application │
├─────────────────────────────────────┤
│ Step 1: Navigate to job posting     │
│ Step 2: Click "Apply Now"           │
│ Step 3: Fill personal info          │
│ Step 4: Upload resume (PDF)         │
│ Step 5: Submit application          │
│ Step 6: Extract confirmation #      │
└─────────────────────────────────────┘
```

Build multi-step workflows visually, then run them on a schedule or via the API.
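Conceptually, the job-application workflow above is an ordered list of natural-language instructions that run one after another. The sketch below illustrates that idea only; it is not how Skyvern stores or executes workflows. `run_workflow`, `StepResult`, `JOB_APPLICATION`, and `fake_act` are hypothetical names, and the `act` parameter is a stand-in for an AI action such as `page.act`.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional

# Hypothetical stand-in for an AI action such as `page.act`;
# this type is an assumption for illustration, not part of Skyvern's API.
ActFn = Callable[[str], Awaitable[None]]


@dataclass
class StepResult:
    instruction: str
    ok: bool
    error: Optional[str] = None


async def run_workflow(steps: List[str], act: ActFn) -> List[StepResult]:
    """Run the steps in order and stop at the first one that raises."""
    results: List[StepResult] = []
    for instruction in steps:
        try:
            await act(instruction)
        except Exception as exc:
            # Record the failure and skip the remaining steps.
            results.append(StepResult(instruction, False, str(exc)))
            break
        results.append(StepResult(instruction, True))
    return results


# The six steps from the workflow diagram, phrased as instructions.
JOB_APPLICATION = [
    "Navigate to the job posting",
    'Click "Apply Now"',
    "Fill in personal info",
    "Upload resume (PDF)",
    "Submit the application",
    "Extract the confirmation number",
]


async def fake_act(instruction: str) -> None:
    # Demo stub: pretend the upload step fails so the early stop is visible.
    if instruction.startswith("Upload"):
        raise RuntimeError("file input not found")


if __name__ == "__main__":
    for result in asyncio.run(run_workflow(JOB_APPLICATION, fake_act)):
        status = "ok" if result.ok else "FAIL"
        print(status, result.instruction, result.error or "")
```

Stopping at the first failure keeps later steps, such as submitting the application, from running against a page in an unexpected state. To try it against a real page, a thin wrapper around `page.act` could be passed as `act`, assuming `page.act` raises when an action cannot be completed.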
### Cloud Features

The managed cloud offering adds:

- **Anti-bot handling**: Bypasses common bot protection systems
- **Proxy network**: Automatic IP rotation across regions
- **CAPTCHA solving**: AI-powered CAPTCHA completion
- **Parallel execution**: Run hundreds of browser sessions simultaneously
- **Session recording**: Full video replay of every automation run

### Real-World Use Cases

| Use Case | Example |
|----------|---------|
| **Procurement** | Auto-fill purchase orders across vendor portals |
| **Insurance** | Fill out quote forms on multiple carrier sites |
| **HR** | Submit job applications across multiple boards |
| **Research** | Extract data from government and financial sites |
| **Testing** | E2E testing that adapts to UI changes |

---

## FAQ

**Q: What is Skyvern?**
A: Skyvern is an AI browser automation platform with 21,000+ GitHub stars that uses LLMs and computer vision to automate any website without brittle selectors. It offers a Python/TypeScript SDK, a no-code UI, and a managed cloud.

**Q: How is Skyvern different from Stagehand or Browser Use?**
A: Skyvern combines computer vision with LLMs for visual page understanding, offers a no-code workflow builder, and provides a managed cloud with anti-bot handling. Stagehand is a TypeScript-first library; Browser Use is a Python agent framework. Skyvern is best for enterprise automation at scale.

**Q: Is Skyvern free?**
A: The open-source version (AGPL-3.0) is free to self-host. Skyvern Cloud offers a free tier, with paid plans for production scale.

---

## Source & Thanks

> Created by [Skyvern-AI](https://github.com/skyvern-ai). Licensed under AGPL-3.0.
>
> [skyvern](https://github.com/skyvern-ai/skyvern): ⭐ 21,000+

Thanks to the Skyvern team for bringing visual AI understanding to browser automation.

---

Source: https://tokrepo.com/en/workflows/skyvern-ai-visual-browser-automation-agent-6da285ea
Author: Agent Toolkit