Scripts · May 7, 2026 · 4 min read

Open Interpreter OS Mode — Natural-Language Computer Control

Open Interpreter OS Mode adds full computer control through screenshots and clicks. It drives any GUI app (terminal, browser, Photoshop) with natural language.

Agent ready

This asset can be read and installed directly by agents

TokRepo exposes a universal CLI command, install contract, metadata JSON, adapter-aware plan, and raw content links so agents can judge fit, risk, and next actions.

Needs Confirmation · 66/100 · Policy: confirm
Agent surface: Any MCP/CLI agent
Kind: Skill
Install: Single
Trust: New
Entrypoint: Asset
Universal CLI install command
npx tokrepo install ad989392-1904-409f-b81c-11689a4c3313
Intro

Open Interpreter's OS Mode extends the natural-language CLI into full computer control. The agent takes screenshots to see the screen, then drives any GUI app via clicks, keystrokes, and shell commands.

Best for: research, experimentation, and one-off automation that touches GUI apps you can't script (Photoshop, Excel macros, Zoom, third-party desktop tools).
Works with: macOS, Windows, Linux.
Setup time: 5 minutes.


Install + start

pip install open-interpreter
interpreter --os

The first run prompts for your LLM API key. OpenAI is the default provider; pass --model claude-3-5-sonnet-20241022 to use Claude instead.
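If you start OS Mode from a script rather than by typing the command, the invocation can be assembled programmatically. The sketch below is only an illustration: build_os_mode_command and launch are hypothetical helpers, not part of Open Interpreter, and they use nothing beyond the --os and --model flags shown in this section.

```python
import shutil
import subprocess


def build_os_mode_command(model=None):
    """Return the argv list for starting Open Interpreter in OS Mode.

    Hypothetical helper for illustration; it only wraps the CLI flags
    mentioned in this article.
    """
    argv = ["interpreter", "--os"]
    if model:
        # Leave --model out to fall back to the default provider.
        argv += ["--model", model]
    return argv


def launch(model=None):
    """Start an interactive session, failing early if the CLI is missing."""
    if shutil.which("interpreter") is None:
        raise RuntimeError("interpreter not found; run: pip install open-interpreter")
    # No redirection, so the child inherits the terminal and stays interactive.
    return subprocess.call(build_os_mode_command(model))


if __name__ == "__main__":
    print(build_os_mode_command("claude-3-5-sonnet-20241022"))
```

Keeping the argv as a list (rather than one shell string) avoids quoting problems if the model name ever contains unusual characters.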

Sample session

> Open Photoshop, create a new 1200x630 document, fill it with a navy-to-orange
  gradient, and add the text "Q3 Report" centered in white.

[OS Mode takes a screenshot, identifies the dock, clicks Photoshop]
[Wait for app launch...]
[Clicks File > New, types dimensions, creates document]
[Selects gradient tool, picks colors, drags from top-left to bottom-right]
[Adds text layer, types "Q3 Report", aligns center]
> Done. Want me to save the file?
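The bracketed steps in the session follow an observe, decide, act cycle. A minimal sketch of that cycle is shown here, assuming stand-in functions for the screenshot, the model call, and the input driver; every name in it (Action, run_loop, take_screenshot, plan_next, perform) is hypothetical and does not reflect Open Interpreter's internals.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str         # "click", "type", or "done"
    target: str = ""  # what to click or type (free text in this sketch)


def run_loop(task, take_screenshot, plan_next, perform, max_steps=20):
    """Repeat observe -> decide -> act until the planner reports it is done."""
    history = []
    for _ in range(max_steps):
        screen = take_screenshot()                 # observe
        action = plan_next(task, screen, history)  # decide (the LLM call in a real agent)
        if action.kind == "done":
            break
        perform(action)                            # act
        history.append(action)
    return history


if __name__ == "__main__":
    # Stand-ins so the sketch runs without a screen or a model.
    script = iter([
        Action("click", "Photoshop"),
        Action("click", "File > New"),
        Action("type", "1200x630"),
        Action("done"),
    ])
    steps = run_loop(
        task="create a 1200x630 document",
        take_screenshot=lambda: "<pixels>",
        plan_next=lambda task, screen, history: next(script),
        perform=lambda action: print(action.kind, action.target),
    )
    print(len(steps), "actions performed")
```

The max_steps cap matters in practice: each iteration costs a model call, so a bounded loop keeps a confused agent from clicking indefinitely.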

Safety prompts

OS Mode asks for confirmation before destructive actions:

About to: Empty Trash (irreversible). Confirm? [y/N]

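You can change how confirmation works with two flags. --auto_run (short form -y) skips the confirmation prompts entirely. --safe_mode adds an experimental code scan before anything runs; the Open Interpreter documentation lists off, ask, and auto as its values:

interpreter --os --safe_mode ask

Safe mode is a scanner, not a sandbox, so do not rely on it to block file deletion, unexpected network calls, or shell commands such as rm and dd. The documentation also notes that enabling safe mode disables auto_run, so the two flags are not meant to be combined.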
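The idea of gating risky shell commands behind a confirmation can be sketched independently of Open Interpreter. The example below is an assumption-laden illustration, not the tool's implementation: is_destructive, confirm, and the DESTRUCTIVE set are hypothetical names, and the denylist holds only the two commands named above.

```python
import os
import re
import shlex

# Illustrative denylist built from the commands named above; extend as needed.
DESTRUCTIVE = {"rm", "dd"}


def is_destructive(command):
    """Return True if any segment of a shell command line starts with a denied program."""
    # Split on common separators so "ls && rm x" is checked segment by segment.
    for segment in re.split(r"&&|\|\||;|\|", command):
        try:
            words = shlex.split(segment)
        except ValueError:
            return True  # unbalanced quotes: refuse rather than guess
        if words and words[0] == "sudo":
            words = words[1:]
        if words and os.path.basename(words[0]) in DESTRUCTIVE:
            return True
    return False


def confirm(command, ask=input):
    """Safe commands pass; risky ones need an explicit 'y' from the user."""
    if not is_destructive(command):
        return True
    answer = ask(f"About to run: {command}. Confirm? [y/N] ")
    return answer.strip().lower() == "y"
```

A name-based check like this is easy to bypass (aliases, scripts, subshells), so it works as a speed bump in front of a human, not as a security boundary.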

When NOT to use OS Mode

  • Production automation: use Browser Use (browser only) or platform APIs instead
  • Time-critical work: OS Mode latency is roughly 5-15 s per click
  • Anything sensitive: screenshots of your screen are sent off your machine to the LLM provider (unless you run a local model)

OS Mode shines for one-off, exploratory, "I'd rather describe this than learn the GUI" tasks.
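To see what the per-click latency means for a whole task, multiply it out. The helper below is a back-of-the-envelope sketch; estimate_seconds is a hypothetical name, and its defaults are simply the 5-15 s range quoted in the list above.

```python
def estimate_seconds(actions, low=5, high=15):
    """Return (best, worst) wall-clock seconds for a task of `actions` GUI steps."""
    if actions < 0:
        raise ValueError("actions must be non-negative")
    return actions * low, actions * high


if __name__ == "__main__":
    best, worst = estimate_seconds(24)
    print(f"24 actions: {best / 60:.1f} to {worst / 60:.1f} minutes")
```

By this estimate a 24-step task takes roughly 2 to 6 minutes, which is fine for a one-off job and far too slow for anything that runs on a schedule.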


FAQ

Q: Is Open Interpreter free? A: Yes. It is open source under the AGPL-3.0 license. You bring your own LLM API key (OpenAI / Anthropic / local). Inference cost depends on the model and how visual the task is (vision-capable models are more expensive).

Q: How does this differ from Browser Use? A: Browser Use is browser-only (clicks inside Chrome). OS Mode is whole-OS (any GUI app, terminal, dock). Use Browser Use for web scraping and OS Mode for desktop-app automation. The two have different latency and reliability profiles.

Q: Will it work on a remote server? A: Only in a limited way. OS Mode needs a screen and input devices. For headless or server contexts, use Open Interpreter's standard mode (no --os), which runs entirely in the terminal and works on any Linux box.
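A launcher script can make the headless decision automatically. The sketch below is a heuristic under stated assumptions: has_display and interpreter_args are hypothetical helpers, and the check relies on the DISPLAY and WAYLAND_DISPLAY environment variables that X11 and Wayland sessions set on Linux.

```python
import os
import sys


def has_display(environ=None, platform=None):
    """Heuristic: is there a graphical session that OS Mode could drive?"""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    if platform.startswith("linux"):
        # X11 sets DISPLAY and Wayland sets WAYLAND_DISPLAY; a plain SSH
        # session without X forwarding typically has neither.
        return bool(environ.get("DISPLAY") or environ.get("WAYLAND_DISPLAY"))
    # Assumption: macOS and Windows sessions normally have a desktop attached.
    return True


def interpreter_args(environ=None, platform=None):
    """Pick `interpreter --os` when a display exists, plain `interpreter` otherwise."""
    args = ["interpreter"]
    if has_display(environ, platform):
        args.append("--os")
    return args


if __name__ == "__main__":
    print(interpreter_args())
```

Passing environ and platform as parameters keeps the check testable; in normal use both default to the current process.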


Quick Use

  1. pip install open-interpreter
  2. interpreter --os (the OS-control flag)
  3. Type natural-language commands; confirm prompts before destructive ops

Source & Thanks

Built by Open Interpreter. Licensed under AGPL-3.0.

OpenInterpreter/open-interpreter — ⭐ 60,000+
