Scripts · May 7, 2026 · 4 min read

Open Interpreter OS Mode — Natural-Language Computer Control

Open Interpreter OS Mode adds full computer control through screenshots and clicks, driving any GUI app (terminal, browser, Photoshop) from natural-language instructions.

Agent-ready

This asset can be read and installed directly by agents

TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and the raw content to help agents assess fit, risk, and next actions.

Needs Confirmation · 66/100 · Policy: confirm
Agent surface: Any MCP/CLI agent
Type: Skill
Installation: Single
Trust: New
Entry point: Asset
Universal CLI command:
npx tokrepo install ad989392-1904-409f-b81c-11689a4c3313

Introduction

Open Interpreter's OS Mode extends the natural-language CLI into full computer control. The agent takes screenshots, interprets what is on screen, and drives any GUI app via clicks, keystrokes, and shell commands.

Best for: research, experimentation, and one-off automation that touches GUI apps you can't script (Photoshop, Excel macros, Zoom, third-party desktop tools).
Works with: macOS, Windows, Linux.
Setup time: 5 minutes.
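
The screenshot, decide, act cycle described above can be sketched as a small loop. This is an illustrative outline only, not Open Interpreter's actual implementation: Action, run_loop, capture, decide, and perform are hypothetical names, and the screen capture and model call are injected as plain callables so the sketch runs without a display or an API key.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step chosen by the model (hypothetical shape, for illustration)."""
    kind: str          # "click", "type", "shell", or "done"
    payload: str = ""  # e.g. a menu path, text to type, or a command


def run_loop(
    capture: Callable[[], bytes],
    decide: Callable[[str, bytes, List[Action]], Action],
    perform: Callable[[Action], None],
    goal: str,
    max_steps: int = 20,
) -> List[Action]:
    """Observe the screen, ask the model for the next action, perform it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                      # observe
        action = decide(goal, screenshot, history)  # model picks the next step
        if action.kind == "done":
            break
        perform(action)                             # click / type / run
        history.append(action)
    return history


# Stand-ins so the sketch is runnable without a screen or an LLM.
def fake_capture() -> bytes:
    return b"fake-png-bytes"


def fake_decide(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    script = [Action("click", "File > New"), Action("type", "1200x630")]
    return script[len(history)] if len(history) < len(script) else Action("done")


performed: List[Action] = []
steps = run_loop(fake_capture, fake_decide, performed.append, "create a document")
print([a.kind for a in steps])
```

The max_steps cap is a design choice worth keeping in any real variant: each iteration costs a model call, so an unbounded loop can run up both time and spend.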


Install + start

pip install open-interpreter
interpreter --os

The first run prompts for your LLM API key (OpenAI is the default; pass --model claude-3-5-sonnet-20241022 to use Claude).
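
If you launch OS Mode from a script, a small preflight check can catch a missing key before the session starts. The sketch below is an assumption-laden helper, not part of Open Interpreter: launch_args and KEY_VARS are hypothetical names, and it assumes your keys live in the conventional OPENAI_API_KEY and ANTHROPIC_API_KEY environment variables. It only uses the --os and --model flags shown above.

```python
from typing import List, Mapping, Optional

# Environment variable conventionally read for each provider (assumption:
# your setup uses these standard names rather than a config file).
KEY_VARS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def launch_args(provider: str, model: Optional[str], env: Mapping[str, str]) -> List[str]:
    """Build the argv for OS Mode, failing early if the provider's key is unset."""
    var = KEY_VARS[provider]
    if not env.get(var):
        raise RuntimeError(f"{var} is not set; export it or enter the key when prompted")
    argv = ["interpreter", "--os"]
    if model:
        argv += ["--model", model]  # omit to keep the default model
    return argv


print(launch_args("anthropic", "claude-3-5-sonnet-20241022", {"ANTHROPIC_API_KEY": "sk-test"}))
```

In real use you would pass os.environ as env and hand the returned list to subprocess.run.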

Sample session

> Open Photoshop, create a new 1200x630 document, fill it with a navy-to-orange
  gradient, and add the text "Q3 Report" centered in white.

[OS Mode takes a screenshot, identifies the dock, clicks Photoshop]
[Wait for app launch...]
[Clicks File > New, types dimensions, creates document]
[Selects gradient tool, picks colors, drags from top-left to bottom-right]
[Adds text layer, types "Q3 Report", aligns center]
> Done. Want me to save the file?

Safety prompts

OS Mode asks for confirmation before destructive actions:

About to: Empty Trash (irreversible). Confirm? [y/N]

To skip the per-action confirmation, pass --auto_run; the command below pairs it with safe mode so that guardrails stay in place:

interpreter --os --auto_run --safe_mode high

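safe_mode high is described as rejecting file deletion, network calls to unknown hosts, and shell commands containing rm, dd, etc. Upstream documentation lists off, ask, and auto as the accepted safe_mode values, so run interpreter --help to confirm what your installed version supports.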
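
The kind of command filtering described above can be approximated with a small denylist check. This is a sketch of the idea, not how Open Interpreter implements safe mode: is_destructive and DENYLIST are hypothetical names, the list holds only the two programs named in the text, and the splitting on shell separators is deliberately naive.

```python
import re
import shlex
from typing import FrozenSet

DENYLIST: FrozenSet[str] = frozenset({"rm", "dd"})  # from the text; extend as needed


def is_destructive(command: str, denylist: FrozenSet[str] = DENYLIST) -> bool:
    """Return True if any segment of a shell command line starts with a denied program."""
    # Split on common separators so `ls && rm -rf x` is caught too.
    for segment in re.split(r"&&|\|\||[;|]", command):
        try:
            tokens = shlex.split(segment)
        except ValueError:
            return True  # unbalanced quotes: refuse rather than guess
        # Skip a leading sudo so `sudo rm` is treated like `rm`.
        if tokens and tokens[0] == "sudo":
            tokens = tokens[1:]
        # Compare the program's base name so `/bin/rm` matches `rm`.
        if tokens and tokens[0].rsplit("/", 1)[-1] in denylist:
            return True
    return False


for cmd in ["ls -la", "echo rm", "ls && rm -rf build", "sudo dd if=/dev/zero of=disk.img"]:
    print(f"{cmd!r}: {'blocked' if is_destructive(cmd) else 'allowed'}")
```

Because separators inside quoted strings are also split, the check errs on the side of blocking. A denylist like this is easy to bypass (for example through an interpreter one-liner), so treat it as a speed bump, not a sandbox.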

When NOT to use OS Mode

  • Production automation: use Browser Use (browser only) or platform APIs instead
  • Time-critical work: OS Mode latency is ~5-15s per click
  • Anything sensitive: every screenshot is sent off your machine to the LLM provider

OS Mode shines for one-off, exploratory, "I'd rather describe this than learn the GUI" tasks.
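
To judge whether a task is too slow for OS Mode, multiply the number of GUI actions by the per-click latency quoted above. The helper below is a back-of-the-envelope sketch: estimate_seconds is a hypothetical name, and the 5 and 15 second defaults are simply the range stated in this article, not measured values.

```python
from typing import Tuple


def estimate_seconds(actions: int, low: float = 5.0, high: float = 15.0) -> Tuple[float, float]:
    """Return the (best, worst) total latency in seconds for a number of GUI actions."""
    if actions < 0:
        raise ValueError("actions must be non-negative")
    return actions * low, actions * high


best, worst = estimate_seconds(40)
print(f"{best / 60:.1f} to {worst / 60:.1f} minutes")  # prints "3.3 to 10.0 minutes"
```

A 40-click workflow therefore lands somewhere between roughly three and ten minutes, which is acceptable for a one-off task but not for anything on a deadline.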


FAQ

Q: Is Open Interpreter free? A: Yes, the code is open source (the upstream repository lists AGPL-3.0; check its LICENSE file to confirm). You bring your own LLM API key (OpenAI / Anthropic / local). Inference cost depends on the model and on how visual the task is, since every screenshot sent to a vision-capable model adds to token usage.

Q: How does this differ from Browser Use? A: Browser Use is browser-only (clicks inside Chrome). OS Mode is whole-OS (any GUI app, terminal, dock). Use Browser Use for web scraping and OS Mode for desktop-app automation; the two have different latency and reliability profiles.

Q: Will it work on a remote server? A: Only in a limited way, because OS Mode needs a screen and input devices. For headless or server contexts, use Open Interpreter's standard mode (no --os), which is shell-only and works on any Linux box.
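
A launcher script can make the headless fallback automatic. The sketch below is a best-effort heuristic, not an Open Interpreter feature: has_display and choose_command are hypothetical names, and the check assumes that on Linux a graphical session sets DISPLAY (X11) or WAYLAND_DISPLAY (Wayland).

```python
from typing import List, Mapping


def has_display(platform: str, env: Mapping[str, str]) -> bool:
    """Best-effort check for a graphical session."""
    if platform.startswith("linux"):
        return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    # macOS and Windows: assume a desktop session. This is not true for
    # every SSH or service context, so treat it as a default, not a guarantee.
    return True


def choose_command(platform: str, env: Mapping[str, str]) -> List[str]:
    """Use OS Mode when a screen is available, otherwise fall back to shell-only mode."""
    return ["interpreter", "--os"] if has_display(platform, env) else ["interpreter"]


print(choose_command("linux", {}))                 # headless server
print(choose_command("linux", {"DISPLAY": ":0"}))  # desktop session
```

In real use, pass sys.platform and os.environ instead of the literal values.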


Quick Use

  1. pip install open-interpreter
  2. interpreter --os (the OS-control flag)
  3. Type natural-language commands; confirm prompts before destructive ops

Source & Thanks

Built by Open Interpreter. The upstream repository lists its license as AGPL-3.0; check the LICENSE file to confirm.

OpenInterpreter/open-interpreter — ⭐ 60,000+
