Scripts · Jul 5, 2026 · 3 min read

PyAutoGUI — Cross-Platform GUI Automation for Python

Python module for programmatically controlling mouse and keyboard to automate GUI interactions on Windows, macOS, and Linux.

Agent-ready

Install with prior review

This asset requires review. The copied prompt requests a dry run, shows the writes, and continues only after confirmation.

Needs Confirmation · 66/100 · Policy: confirm
Agent surface: Any MCP/CLI agent
Type: Skill
Installation: Single
Trust: Established
Entry: PyAutoGUI Overview
Command with prior review:
npx -y tokrepo@latest install 9e6ff1e5-7808-11f1-9bc6-00163e2b0d79 --target codex

Do a dry run first, confirm the writes, and then run this command.

Introduction

PyAutoGUI is a cross-platform Python module that lets you programmatically control the mouse and keyboard. It works on Windows, macOS, and Linux, making it a go-to choice for automating repetitive GUI tasks, building test harnesses, and scripting desktop workflows.

What PyAutoGUI Does

  • Moves the mouse cursor and performs clicks, drags, and scrolls
  • Types text and sends keyboard hotkeys to any application
  • Takes screenshots and locates images on screen for visual automation
  • Displays simple alert, confirm, and prompt dialog boxes
  • Provides a fail-safe mechanism to abort runaway scripts
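
As a rough illustration of the mouse and keyboard items above, the sketch below clicks a field, types into it, and sends a hotkey. The function name `fill_field`, the coordinates, and the typed text are hypothetical; `click`, `write`, and `hotkey` are PyAutoGUI functions. The module is passed in as `gui` (an assumption made here for convenience) so the sequence can be tried against a stand-in object on a machine without a display.

```python
def fill_field(gui, x, y, text):
    """Click at (x, y), type `text`, then press Ctrl+A.

    `gui` is expected to be the pyautogui module, or any object that
    offers the same click/write/hotkey functions. The name and the
    arguments are illustrative, not part of PyAutoGUI.
    """
    gui.click(x, y)                 # move to (x, y) and left-click
    gui.write(text, interval=0.05)  # type with a 50 ms gap between keys
    gui.hotkey("ctrl", "a")         # press Ctrl+A as a key chord
```

With PyAutoGUI installed, a call might look like `fill_field(pyautogui, 200, 300, "hello")` after `import pyautogui`.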

Architecture Overview

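PyAutoGUI wraps platform-specific APIs through backend modules: Win32 on Windows, Quartz on macOS, and Xlib on Linux. The high-level API is identical across platforms, so most scripts are portable, although screen coordinates and modifier keys (for example Ctrl versus Command) can still differ between systems. Image matching relies on Pillow, with OpenCV as an optional dependency, to search for a sub-image within a screenshot.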
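
The idea of one API over several platform backends can be sketched as a lookup keyed on `sys.platform`. This is only a model of the dispatch described above, not PyAutoGUI's actual source; the function name `backend_for` and the labels in the table are assumptions chosen for illustration.

```python
import sys

# Illustrative labels only; these are not PyAutoGUI's real module names.
_BACKENDS = {
    "win32": "Win32 API",
    "darwin": "Quartz",
    "linux": "Xlib",
}


def backend_for(platform_name=None):
    """Return the backend label for a sys.platform-style string."""
    name = sys.platform if platform_name is None else platform_name
    for prefix, backend in _BACKENDS.items():
        if name.startswith(prefix):
            return backend
    raise NotImplementedError(f"no GUI backend for platform {name!r}")
```

Because the choice is made once, behind the public functions, calling code stays the same on every supported system.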

Self-Hosting & Configuration

  • Install with pip; no native compilation needed on most systems
  • On Linux, install scrot (used for screenshots) along with python3-tk and python3-dev for full functionality
  • Set pyautogui.PAUSE to the number of seconds to wait after each PyAutoGUI call (default 0.1) to throttle a script
  • Enable or disable the fail-safe corner abort via pyautogui.FAILSAFE
  • Integrates with virtual display servers (Xvfb) for headless CI automation
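
The two settings in the list above are plain module attributes. A minimal sketch of applying them follows; the helper name `configure` and its default values are hypothetical, while `PAUSE` and `FAILSAFE` are the PyAutoGUI attributes named above. The module is passed in as `gui` so the helper can be checked without a display.

```python
def configure(gui, pause=0.5, failsafe=True):
    """Apply throttling and fail-safe settings; return the previous values.

    `gui` is expected to be the pyautogui module. The helper itself is
    illustrative and not part of PyAutoGUI.
    """
    previous = (gui.PAUSE, gui.FAILSAFE)
    gui.PAUSE = pause        # seconds to wait after each PyAutoGUI call
    gui.FAILSAFE = failsafe  # True keeps the corner abort enabled
    return previous
```

Returning the previous values makes it easy to restore them after a slow or sensitive section of a script.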

Key Features

  • Unified API across Windows, macOS, and Linux with no code changes
  • Image-based locating finds UI elements by screenshot matching
  • Built-in fail-safe: move the mouse to a screen corner to abort a script with a FailSafeException at the next PyAutoGUI call
  • Tweening functions for smooth, human-like mouse movements
  • Lightweight with minimal dependencies (Pillow for screenshots)
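
To show what a tweening function does, the sketch below implements one common easing curve (ease-in-out quadratic) in plain Python and uses it to space points along a straight line. This is a standalone model, not PyAutoGUI's code; the names `ease_in_out_quad` and `tween_path` are chosen here for illustration.

```python
def ease_in_out_quad(n):
    """Map linear progress n in [0, 1] to eased progress in [0, 1]."""
    if not 0.0 <= n <= 1.0:
        raise ValueError("n must be between 0.0 and 1.0")
    if n < 0.5:
        return 2 * n * n                 # accelerate during the first half
    n = 2 * n - 1
    return -0.5 * (n * (n - 2) - 1)      # decelerate during the second half


def tween_path(start, end, steps, tween=ease_in_out_quad):
    """Return `steps` points from start to end, spaced by the tween curve."""
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(1, steps + 1):
        t = tween(i / steps)
        points.append((round(x0 + (x1 - x0) * t), round(y0 + (y1 - y0) * t)))
    return points
```

The points are bunched near both ends and spread out in the middle, which is what makes the motion look less mechanical. In PyAutoGUI itself a tween is passed to the movement call, for example `pyautogui.moveTo(500, 400, duration=1, tween=pyautogui.easeInOutQuad)`.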

Comparison with Similar Tools

  • Selenium — browser-specific automation; PyAutoGUI works with any desktop app
  • Playwright — web automation with modern APIs but limited to browsers
  • xdotool — Linux-only CLI tool without cross-platform support
  • AutoHotkey — Windows-only scripting language with its own syntax
  • Pydoll — browser automation without WebDriver, not general GUI control

FAQ

Q: Can PyAutoGUI interact with applications running as admin? A: On Windows, the script must also run with elevated privileges to send input to admin windows.

Q: Does it work in headless environments? A: On Linux, you can use Xvfb to create a virtual display. macOS and Windows require an active desktop session.

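Q: How accurate is image-based element detection? A: It works well for static UIs, where it looks for a pixel-level match. For dynamic or scaled interfaces, a confidence threshold (which requires OpenCV) tolerates small pixel differences, and grayscale matching speeds up the search at the cost of possible false matches.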
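
A minimal sketch of the confidence and grayscale options mentioned in this answer, assuming OpenCV is installed so that `confidence` is accepted. The helper name `find_center`, the 0.9 threshold, and passing the module in as `gui` are assumptions for illustration; `locateOnScreen` and `center` are PyAutoGUI functions.

```python
def find_center(gui, image_path, confidence=0.9):
    """Return the (x, y) center of `image_path` on screen, or None.

    `gui` is expected to be the pyautogui module. The helper is
    illustrative and not part of PyAutoGUI.
    """
    # Depending on the installed version, a failed search either raises
    # ImageNotFoundException or returns None. Handling both is a
    # defensive assumption; an empty tuple means "catch nothing".
    not_found = getattr(gui, "ImageNotFoundException", ())
    try:
        box = gui.locateOnScreen(
            image_path, confidence=confidence, grayscale=True
        )
    except not_found:
        return None
    if box is None:
        return None
    x, y = gui.center(box)  # center of the (left, top, width, height) box
    return (x, y)
```

A caller can then click only when a match exists, for example by checking that the result is not None before passing the coordinates to `pyautogui.click`.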

Q: Is PyAutoGUI suitable for game automation? A: It can automate simple 2D games but struggles with 3D applications that bypass standard input APIs.
