# Open Interpreter OS Mode: Natural-Language Computer Control

> Open Interpreter OS Mode adds full computer control via screenshots and clicks. It drives any GUI app (terminal, browser, Photoshop) with natural language.

## Quick Use

1. `pip install open-interpreter`
2. `interpreter --os` (the OS-control flag)
3. Type natural-language commands; confirm prompts before destructive operations

---

## Intro

Open Interpreter's OS Mode extends the natural-language CLI into full computer control. The agent takes screenshots to see the screen and drives any GUI app via clicks, keystrokes, and shell commands.

Best for: research, experimentation, and one-off automation that touches GUI apps you can't script (Photoshop, Excel macros, Zoom, third-party desktop tools).

Works with: macOS, Windows, Linux.

Setup time: 5 minutes.

---

### Install + start

```bash
pip install open-interpreter
interpreter --os
```

The first run prompts for your LLM API key. OpenAI is the default provider; pass `--model claude-3-5-sonnet-20241022` to use Claude.

### Sample session

```
> Open Photoshop, create a new 1200x630 document, fill it with a navy-to-orange gradient, and add the text "Q3 Report" centered in white.

[OS Mode takes a screenshot, identifies the dock, clicks Photoshop]
[Wait for app launch...]
[Clicks File > New, types dimensions, creates document]
[Selects gradient tool, picks colors, drags from top-left to bottom-right]
[Adds text layer, types "Q3 Report", aligns center]

> Done. Want me to save the file?
```

### Safety prompts

OS Mode is described as asking for confirmation before destructive actions, with a prompt along these lines:

```
About to: Empty Trash (irreversible). Confirm? [y/N]
```

Confirmation behavior depends on version and settings: `--auto_run` (`-y`) turns approval prompts off, so verify what your installed version does before handing it irreversible tasks.

Open Interpreter also has an experimental safe mode, selected with `--safe_mode`. Its documented values are `off`, `ask`, and `auto`:

```bash
interpreter --os --safe_mode ask
```

With `ask`, you are asked whether to scan generated code before it runs; with `auto`, the scan runs automatically. Safe mode is a code scan, not a fixed blocklist, so do not count on it to reject file deletion, network calls to unknown hosts, or shell commands containing `rm` or `dd`.
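To make the idea of a confirmation gate concrete, here is a minimal sketch of a pre-check you could run on a shell command before letting an agent execute it. It is not Open Interpreter's implementation: the names `DESTRUCTIVE_COMMANDS`, `is_destructive`, and `confirm` are hypothetical, and the blocklist is an assumption chosen for illustration.

```python
import shlex

# Hypothetical blocklist for illustration; not taken from Open Interpreter.
DESTRUCTIVE_COMMANDS = {"rm", "dd", "mkfs", "shred"}


def is_destructive(command: str) -> bool:
    """Return True if any token of the command names a blocklisted program."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: treat the command as unsafe rather than guess.
        return True
    # Compare the last path component so "/bin/rm" matches "rm".
    return any(token.rsplit("/", 1)[-1] in DESTRUCTIVE_COMMANDS for token in tokens)


def confirm(command: str, ask=input) -> bool:
    """Let safe commands through; require an explicit 'y' for destructive ones."""
    if not is_destructive(command):
        return True
    answer = ask(f"About to run: {command!r} (possibly irreversible). Confirm? [y/N] ")
    return answer.strip().lower() == "y"
```

A token check like this is crude: it flags harmless commands such as `grep dd notes.txt` and misses commands glued together without spaces (`ls;rm x`), which is one reason a confirmation prompt should default to "no" and stay in place even when a filter is used.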
### When NOT to use OS Mode

- Production automation: use Browser Use (browser only) or platform APIs instead
- Time-critical work: OS Mode latency is roughly 5-15 s per click
- Anything sensitive: the screenshots leave your machine and go to the LLM provider

OS Mode shines for one-off, exploratory, "I'd rather describe this than learn the GUI" tasks.

---

### FAQ

**Q: Is Open Interpreter free?**
A: Yes. It is open source (AGPL-3.0 according to the repository; check its LICENSE file for the current terms). You bring your own LLM API key (OpenAI, Anthropic, or a local model). Inference cost depends on the model and on how visual the task is, since every screenshot sent to a vision-capable model adds to the bill.

**Q: How does this differ from Browser Use?**
A: Browser Use is browser-only (clicks inside Chrome). OS Mode covers the whole OS (any GUI app, terminal, dock). Use Browser Use for web scraping and OS Mode for desktop-app automation. The two have different latency and reliability profiles.

**Q: Will it work on a remote server?**
A: Only in a limited way, because OS Mode needs a screen and input devices. For headless or server contexts, use Open Interpreter's standard mode (no `--os`), which is shell-only and works on any Linux box.

---

## Source & Thanks

> Built by [Open Interpreter](https://github.com/OpenInterpreter). Licensed under AGPL-3.0 (see the repository's LICENSE file).
>
> [OpenInterpreter/open-interpreter](https://github.com/OpenInterpreter/open-interpreter), ⭐ 60,000+

---

## Quick Use

1. `pip install open-interpreter`
2. `interpreter --os` (the OS-control flag)
3. Enter natural-language instructions; confirm before destructive operations

---

## Intro

Open Interpreter's OS Mode extends the natural-language CLI into full computer control. The agent takes screenshots to see the screen and drives any GUI app with clicks, keystrokes, and shell commands. It suits research, experimentation, and one-off automation that touches GUI apps you can't script (Photoshop, Excel macros, Zoom, third-party desktop tools). Compatible with macOS, Windows, and Linux. Setup time: 5 minutes.

---

### Install + start

```bash
pip install open-interpreter
interpreter --os
```

The first run asks for your LLM API key (OpenAI by default; use `--model claude-3-5-sonnet-20241022` to connect Claude).

### Sample session

```
> Open Photoshop, create a new 1200x630 document, fill it with a navy-to-orange gradient, and add the text "Q3 Report" centered in white.
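
[OS Mode takes a screenshot, identifies the Dock, clicks Photoshop]
[Waits for the app to launch...]
[Clicks File > New, enters the dimensions, creates the document]
[Selects the gradient tool, picks colors, drags from top-left to bottom-right]
[Adds a text layer, types "Q3 Report", aligns center]

> Done. Want me to save the file?
```

### Safety prompts

OS Mode is described as asking for confirmation before destructive actions:

```
About to: Empty Trash (irreversible). Confirm? [y/N]
```

Confirmation behavior depends on version and settings, and `--auto_run` (`-y`) turns approval prompts off. The experimental safe mode is selected with `--safe_mode`, whose documented values are `off`, `ask`, and `auto`:

```bash
interpreter --os --safe_mode ask
```

Safe mode scans generated code before it runs. It is not a fixed blocklist, so do not count on it to reject file deletion, network requests to unknown hosts, or shell commands containing `rm` or `dd`.

### When NOT to use OS Mode

- Production automation: use Browser Use (browser only) or platform APIs
- Time-sensitive work: OS Mode adds roughly 5-15 s of latency per click
- Anything sensitive: the screenshots leave your machine and go to the LLM provider

OS Mode suits one-off, exploratory, "I'd rather describe this than learn the GUI" tasks.

---

### FAQ

**Q: Is Open Interpreter free?**
A: Yes. It is open source (AGPL-3.0 according to the repository; check its LICENSE file for the current terms). You use your own LLM API key (OpenAI, Anthropic, or a local model). Inference cost depends on the model and on how visual the task is; vision-capable models cost more.

**Q: How is it different from Browser Use?**
A: Browser Use works only inside the browser (clicks inside Chrome). OS Mode covers the whole OS (any GUI app, terminal, Dock). Use Browser Use for web scraping and OS Mode for desktop-app automation. Latency and reliability differ between the two.

**Q: Can it run on a remote server?**
A: Only in a limited way, because OS Mode needs a screen and input devices. For headless or server scenarios, use Open Interpreter's standard mode (no `--os`), which is shell-only and runs on any Linux machine.

---

Source: https://tokrepo.com/en/workflows/open-interpreter-os-mode-natural-language-computer-control

Author: Open Interpreter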