# Skyvern: AI Visual Browser Automation Agent

> Automate any website using LLMs and computer vision. No selectors needed; it works on sites it has never seen before. 21K+ stars.

## Quick Use

```bash
pip install skyvern
skyvern quickstart
```

This launches the Skyvern UI at `http://localhost:8080`. Or use the SDK:

```python
from skyvern import Skyvern

skyvern = Skyvern(api_key="YOUR_KEY")

# Navigate and interact with any website
task = skyvern.create_task(
    url="https://store.example.com",
    goal="Add the cheapest laptop to cart and proceed to checkout",
)

# Fetch the task by ID to read its status and any extracted data
result = skyvern.get_task(task.task_id)
print(result.status, result.extracted_data)
```

Playwright SDK (new):

```python
import asyncio
from skyvern import SkyvernPlaywright

async def main():
    async with SkyvernPlaywright() as skyvern:
        page = await skyvern.new_page()
        await page.goto("https://example.com")
        await page.act("Click on the login button")
        data = await page.extract("Get the user profile information")

asyncio.run(main())
```

---

## Intro

Skyvern is an AI browser automation platform with 21,000+ GitHub stars. It combines LLMs with computer vision to automate websites, including ones it has never seen before. Unlike traditional automation tools that rely on brittle CSS selectors and XPaths, Skyvern visually understands web pages and plans actions like a human would. It offers a Python/TypeScript SDK (as a Playwright extension), a no-code workflow builder UI, and a managed cloud with anti-bot handling and proxy rotation. It is used in production by companies automating procurement, data entry, and web research at scale.

Works with: any website, OpenAI GPT-4o, Anthropic Claude, Playwright. Best for teams automating complex web workflows that break with traditional selectors. Setup time: under 5 minutes.
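
The SDK example in Quick Use reads the task once, right after creating it, so the status it prints may not be final yet. A minimal polling sketch follows, assuming the object returned by `get_task` exposes a `status` attribute as in that example. `wait_for_task` is a hypothetical helper, not part of the Skyvern SDK, and the terminal status names are assumptions to adjust for your installed version.

```python
import time
from typing import Any, Callable

# Assumed terminal status names; adjust to the values your Skyvern version returns.
TERMINAL_STATUSES = {"completed", "failed", "terminated", "timed_out", "canceled"}


def wait_for_task(
    fetch: Callable[[], Any],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() until the result's status is terminal or the timeout elapses.

    `fetch` is any zero-argument callable returning an object with a `status`
    attribute. `sleep` is injectable so the loop can be exercised without waiting.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result.status in TERMINAL_STATUSES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {result.status!r} after {timeout}s")
        sleep(interval)
```

With the client from Quick Use, this could be called as `wait_for_task(lambda: skyvern.get_task(task.task_id))`.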

---

## Skyvern Architecture & Capabilities

### Three Interaction Modes

| Mode | How It Works | Best For |
|------|-------------|----------|
| **AI Mode** | Pure natural language: LLM + vision decides what to do | Unknown/dynamic websites |
| **Selector Mode** | Traditional Playwright CSS/XPath selectors | Known, stable pages |
| **AI-Fallback** | Tries selector first, falls back to AI if it fails | Production reliability |

### Core AI Commands (Playwright SDK)

```python
# These calls assume `page` is a Skyvern page and run inside an async function
# (see the Playwright SDK example in Quick Use).

# Act: perform an action described in natural language
await page.act("Fill in the email field with test@example.com")
await page.act("Click the submit button")
await page.act("Select 'Express' from the shipping dropdown")

# Extract: pull structured data from the page
pricing = await page.extract("Get all pricing plans with features")

# Validate: check if a condition is met
is_logged_in = await page.validate("Is the user currently logged in?")

# Prompt: ask the LLM a question about the page
answer = await page.prompt("What payment methods does this site accept?")
```

### Visual Understanding

Skyvern uses computer vision to:

- Identify interactive elements (buttons, forms, dropdowns) by appearance
- Read and understand page layout without DOM access
- Handle CAPTCHAs and visual challenges
- Adapt to layout changes automatically

### No-Code Workflow Builder

The UI at `localhost:8080` provides:

```
┌─────────────────────────────────────┐
│ Workflow: Auto-fill Job Application │
├─────────────────────────────────────┤
│ Step 1: Navigate to job posting     │
│ Step 2: Click "Apply Now"           │
│ Step 3: Fill personal info          │
│ Step 4: Upload resume (PDF)         │
│ Step 5: Submit application          │
│ Step 6: Extract confirmation #      │
└─────────────────────────────────────┘
```

Build multi-step workflows visually, then run them on schedule or via API.
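
A workflow like the job-application example can also be written in code as an ordered list of steps. The sketch below is illustrative and is not Skyvern's workflow API: `Step`, `run_step`, `run_workflow`, and `StubPage` are hypothetical names, and the page object is only assumed to expose an async `click(selector)` (as Playwright pages do) and the `act(instruction)` command from the Core AI Commands section. Each step tries its selector first and falls back to the AI command, mirroring the AI-Fallback mode in the modes table.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional, Protocol


class PageLike(Protocol):
    """Assumed minimal page interface: a selector click plus an AI action."""

    async def click(self, selector: str) -> None: ...
    async def act(self, instruction: str) -> None: ...


@dataclass
class Step:
    instruction: str                 # natural-language description of the step
    selector: Optional[str] = None   # optional fast path for known, stable pages


async def run_step(page: PageLike, step: Step) -> str:
    """Try the selector first; use the AI action if it is missing or fails."""
    if step.selector is not None:
        try:
            await page.click(step.selector)
            return "selector"
        except Exception:
            # Selector broke (for example after a layout change). Real code
            # would catch a narrower error type. Fall through to the AI path.
            pass
    await page.act(step.instruction)
    return "ai"


async def run_workflow(page: PageLike, steps: list[Step]) -> list[str]:
    """Run steps in order and report which mode handled each one."""
    return [await run_step(page, step) for step in steps]


class StubPage:
    """Stand-in page for a dry run; selectors listed in `broken` raise on click."""

    def __init__(self, broken: set[str]) -> None:
        self.broken = broken
        self.log: list[tuple[str, str]] = []

    async def click(self, selector: str) -> None:
        if selector in self.broken:
            raise RuntimeError(f"no element matches {selector}")
        self.log.append(("click", selector))

    async def act(self, instruction: str) -> None:
        self.log.append(("act", instruction))


if __name__ == "__main__":
    steps = [
        Step("Click the Apply Now button", selector="#apply"),
        Step("Fill personal info"),
        Step("Submit application", selector="#submit"),
    ]
    page = StubPage(broken={"#apply"})
    print(asyncio.run(run_workflow(page, steps)))  # prints ['ai', 'ai', 'selector']
```

Returning the mode per step makes it easy to spot selectors that have started failing and are silently costing an LLM call on every run.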

### Cloud Features

The managed cloud offering adds:

- **Anti-bot detection**: Bypasses common bot protection systems
- **Proxy network**: Automatic IP rotation across regions
- **CAPTCHA solving**: AI-powered CAPTCHA completion
- **Parallel execution**: Run hundreds of browser sessions simultaneously
- **Session recording**: Full video replay of every automation run

### Real-World Use Cases

| Use Case | Example |
|----------|---------|
| **Procurement** | Auto-fill purchase orders across vendor portals |
| **Insurance** | Fill out quote forms on multiple carrier sites |
| **HR** | Submit job applications across multiple boards |
| **Research** | Extract data from government and financial sites |
| **Testing** | E2E testing that adapts to UI changes |

---

## FAQ

**Q: What is Skyvern?**
A: Skyvern is an AI browser automation platform with 21,000+ GitHub stars that uses LLMs and computer vision to automate any website without brittle selectors, offering a Python/TypeScript SDK, no-code UI, and managed cloud.

**Q: How is Skyvern different from Stagehand or Browser Use?**
A: Skyvern uniquely combines computer vision with LLMs for visual page understanding, offers a no-code workflow builder, and provides a managed cloud with anti-bot handling. Stagehand is a TypeScript-first library; Browser Use is a Python agent framework. Skyvern is best for enterprise automation at scale.

**Q: Is Skyvern free?**
A: The open-source version (AGPL-3.0) is free to self-host. Skyvern Cloud offers a free tier with paid plans for production scale.

---

## Source & Thanks

> Created by [Skyvern-AI](https://github.com/skyvern-ai). Licensed under AGPL-3.0.
>
> [skyvern](https://github.com/skyvern-ai/skyvern): ⭐ 21,000+

Thanks to the Skyvern team for bringing visual AI understanding to browser automation.

---

Source: https://tokrepo.com/en/workflows/6da285ea-0d45-4bf8-9f34-be39355dc7a7

Author: Agent Toolkit