Scripts · May 7, 2026 · 4 min read

Open Interpreter OS Mode — Natural-Language Computer Control

Open Interpreter OS Mode adds full computer control via screenshots + clicks. Drives any GUI app — terminal, browser, Photoshop — with natural language.

Agent-ready

Install with prior review

This asset requires review. The copied prompt requests a dry run, shows the writes it would make, and continues only after confirmation.

Status: Needs Confirmation (66/100). Policy: confirm
Agent surface: any MCP/CLI agent
Type: Skill
Installation: Single
Trust: Community
Entry point: Asset

Command with prior review:
npx -y tokrepo@latest install ad989392-1904-409f-b81c-11689a4c3313 --target codex

Do a dry run first, confirm the writes, then run this command.

Introduction

Open Interpreter's OS Mode extends the natural-language CLI into full computer control. The agent takes screenshots, interprets what is on screen, and drives any GUI app through clicks, keystrokes, and shell commands.

Best for: research, experimentation, and one-off automation that touches GUI apps you can't script (Photoshop, Excel macros, Zoom, third-party desktop tools).
Works with: macOS, Windows, Linux.
Setup time: about 5 minutes.


Install + start

pip install open-interpreter
interpreter --os

On first run, Open Interpreter prompts for your LLM API key. OpenAI is the default provider; to use Claude, pass --model claude-3-5-sonnet-20241022.
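If you start OS Mode from a script rather than typing the command, the invocation can be assembled programmatically. The sketch below uses only the two flags mentioned in this section (--os and --model); the helper names build_os_mode_command and launch are hypothetical and not part of Open Interpreter.

```python
import shutil
import subprocess
from typing import List, Optional


def build_os_mode_command(model: Optional[str] = None) -> List[str]:
    """Return the argv list for starting Open Interpreter in OS Mode."""
    command = ["interpreter", "--os"]
    if model:
        # Optional model override, e.g. "claude-3-5-sonnet-20241022".
        command += ["--model", model]
    return command


def launch(model: Optional[str] = None) -> int:
    """Run the CLI interactively and return its exit code."""
    if shutil.which("interpreter") is None:
        raise FileNotFoundError(
            "interpreter not found on PATH; run: pip install open-interpreter"
        )
    return subprocess.run(build_os_mode_command(model)).returncode
```

Building the command as a list, rather than a single string, avoids shell quoting issues when the model name comes from configuration.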

Sample session

> Open Photoshop, create a new 1200x630 document, fill it with a navy-to-orange
  gradient, and add the text "Q3 Report" centered in white.

[OS Mode takes a screenshot, identifies the dock, clicks Photoshop]
[Wait for app launch...]
[Clicks File > New, types dimensions, creates document]
[Selects gradient tool, picks colors, drags from top-left to bottom-right]
[Adds text layer, types "Q3 Report", aligns center]
> Done. Want me to save the file?
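The session above follows a screenshot, decide, act cycle. The sketch below shows that control flow in isolation. It is an illustration of the pattern, not Open Interpreter's implementation: Action, run_task, and the three callables (capture, plan, execute) are hypothetical names that you would back with a real screenshot library, an LLM call, and an input driver.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str    # illustrative labels: "click", "type", or "done"
    detail: str  # what to click or type


def run_task(
    capture: Callable[[], bytes],
    plan: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    task: str,
    max_steps: int = 20,
) -> List[Action]:
    """Take a screenshot, ask the planner for one action, perform it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                    # observe the current screen
        action = plan(task, screenshot, history)  # the model picks the next step
        if action.kind == "done":
            break
        execute(action)                           # click or type on the real GUI
        history.append(action)
    return history
```

The max_steps cap is a design choice worth keeping in any real loop: every iteration costs one model call, so an agent that never reports "done" should stop on its own.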

Safety prompts

OS Mode asks for confirmation before destructive actions:

About to: Empty Trash (irreversible). Confirm? [y/N]

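To skip the per-action confirmation, pass --auto_run; the example below pairs it with a safe-mode setting:

interpreter --os --auto_run --safe_mode high

As described here, safe_mode high rejects file deletion, network calls to unknown hosts, and shell commands containing rm, dd, and similar destructive programs. Flag names and accepted values differ between releases, so run interpreter --help to confirm that your installed version accepts these options before relying on them.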
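To make the idea of rejecting risky shell commands concrete, here is a small deny-list check. It illustrates the rule described above and is not Open Interpreter's actual implementation; the function name is_allowed and the contents of the deny list are assumptions (only rm and dd come from the text).

```python
import os
import shlex

# Illustrative deny list: "rm" and "dd" are named in the text, nothing else is assumed.
DENIED_PROGRAMS = {"rm", "dd"}
# Characters that end one command and start another in a shell line.
SEPARATOR_CHARS = set(";|&()")


def is_allowed(command: str) -> bool:
    """Return False if any program invoked in the command line is on the deny list."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    try:
        tokens = list(lexer)
    except ValueError:
        return False  # unbalanced quotes: refuse rather than guess

    at_command_start = True
    for token in tokens:
        if token and all(ch in SEPARATOR_CHARS for ch in token):
            at_command_start = True  # the next token names a new program
            continue
        if at_command_start and os.path.basename(token) in DENIED_PROGRAMS:
            return False
        at_command_start = False
    return True
```

A check like this only catches direct invocations. Wrappers such as sudo, xargs, or a shell started with -c pass straight through, so treat it as a convenience filter rather than a security boundary.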

When NOT to use OS Mode

  • Production automation: use Browser Use (browser only) or platform APIs instead
  • Time-critical work: OS Mode latency is roughly 5-15s per click
  • Anything sensitive: screenshots are sent off your machine to the LLM provider unless you run a local model

OS Mode shines for one-off, exploratory, "I'd rather describe this than learn the GUI" tasks.
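The latency figure translates directly into wall-clock time for a multi-step task. Here is a rough estimate, assuming each GUI action costs the 5 to 15 seconds quoted above and ignoring app launch time (the function name is illustrative):

```python
from typing import Tuple


def estimate_seconds(actions: int, low: float = 5.0, high: float = 15.0) -> Tuple[float, float]:
    """Return the (best case, worst case) duration in seconds for a number of GUI actions."""
    return actions * low, actions * high
```

For a hypothetical 20-action task, estimate_seconds(20) gives 100 to 300 seconds, roughly two to five minutes for something a person familiar with the app might do much faster.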


FAQ

Q: Is Open Interpreter free? A: Yes, it is open source (the upstream repository lists AGPL-3.0; check the LICENSE file of the version you install). You bring your own LLM API key (OpenAI / Anthropic / local). Inference cost depends on the model and how visual the task is (vision-capable models are more expensive).

Q: How does this differ from Browser Use? A: Browser Use is browser-only (clicks inside Chrome). OS Mode covers the whole OS (any GUI app, terminal, dock). Use Browser Use for web scraping and OS Mode for desktop-app automation. The two have different latency and reliability profiles.

Q: Will it work on a remote server? A: Only to a limited extent, because OS Mode needs a screen and input devices. For headless or server contexts, use Open Interpreter's standard mode (no --os), which is shell-only and works on any Linux box.
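A launcher script can make the headless decision automatically. The sketch below assumes that on Linux a graphical session is indicated by the DISPLAY or WAYLAND_DISPLAY environment variable, and that macOS and Windows sessions always have a screen; both are simplifying assumptions, and the function names are hypothetical.

```python
import os
import sys
from typing import List, Mapping, Optional


def has_display(env: Optional[Mapping[str, str]] = None, platform: str = sys.platform) -> bool:
    """Guess whether a graphical session is available."""
    env = os.environ if env is None else env
    if platform.startswith("linux"):
        # X11 sets DISPLAY, Wayland sets WAYLAND_DISPLAY.
        return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    # Assumption: macOS and Windows are treated as always having a screen.
    return True


def choose_command(env: Optional[Mapping[str, str]] = None, platform: str = sys.platform) -> List[str]:
    """Use OS Mode when a screen is available, otherwise fall back to standard mode."""
    command = ["interpreter"]
    if has_display(env, platform):
        command.append("--os")
    return command
```

Passing env and platform as parameters keeps the check easy to exercise without changing the real environment.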


Quick Use

  1. pip install open-interpreter
  2. interpreter --os (the OS-control flag)
  3. Type natural-language commands; confirm prompts before destructive ops

Source & Thanks

Built by Open Interpreter. The upstream repository lists the license as AGPL-3.0; check its LICENSE file for the version you install.

OpenInterpreter/open-interpreter — ⭐ 60,000+
