What is LM Studio CLI — Run Local LLMs from the Command Line?

The official CLI for LM Studio that lets you download, manage, and serve local language models with an OpenAI-compatible API from your terminal.

Is LM Studio CLI — Run Local LLMs from the Command Line free to use?

Yes. LM Studio CLI — Run Local LLMs from the Command Line is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install LM Studio CLI — Run Local LLMs from the Command Line?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

LM Studio CLI — Run Local LLMs from the Command Line

Introduction

LM Studio CLI (lms) is the official command-line interface for LM Studio, providing terminal-native access to downloading, managing, and serving local language models. It exposes an OpenAI-compatible server, making it straightforward to integrate local LLMs into development workflows, scripts, and AI applications without leaving the terminal.

What LM Studio CLI Does

Downloads and manages GGUF and other quantized model files
Starts a local inference server with OpenAI-compatible API
Lists available models from the LM Studio model catalog
Controls the running server (load, unload, status)
Supports hardware acceleration on Apple Silicon, NVIDIA, and AMD GPUs

Architecture Overview

The CLI communicates with the LM Studio runtime daemon running locally. When you start a server, it loads the selected model into GPU or CPU memory using the appropriate backend (MLX on Apple Silicon, llama.cpp on other platforms). The server exposes REST endpoints matching the OpenAI Chat Completions and Embeddings APIs, enabling any OpenAI-compatible client to connect.

Self-Hosting & Configuration

Install via npx (Node.js) or download the standalone binary
Models download to a configurable local directory
Server binds to localhost:1234 by default (configurable)
GPU layers and context length set via command flags or config file
Runs on macOS, Windows, and Linux

Key Features

One-command model download with automatic format detection
OpenAI-compatible API allows drop-in replacement of cloud models
Automatic GPU detection and memory allocation
Supports multiple concurrent models on capable hardware
Structured JSON output mode for scripting and automation

Comparison with Similar Tools

Ollama — similar local LLM serving; LM Studio CLI integrates with the LM Studio desktop ecosystem
llama.cpp server — lower-level; LM Studio CLI adds model management and easier setup
LocalAI — broader model type support; LM Studio CLI focuses on chat and embedding models
GPT4All CLI — similar concept; LM Studio CLI has broader model catalog access

FAQ

Q: Do I need LM Studio desktop app installed? A: The CLI installs the LM Studio runtime automatically. The desktop GUI is optional.

Q: Which model formats are supported? A: GGUF is the primary format. MLX models are supported on Apple Silicon.

Q: Can I use it in CI/CD pipelines? A: Yes. The CLI supports non-interactive mode and can be scripted for automated testing against local models.

Q: How much VRAM do I need? A: Depends on the model. 3B parameter models need roughly 2-3 GB. 7B models need 4-8 GB. CPU inference works with system RAM.

LM Studio CLI — Run Local LLMs from the Command Line

This asset can be read and installed directly by agents

Introduction

What LM Studio CLI Does

Architecture Overview

Self-Hosting & Configuration

Key Features

Comparison with Similar Tools

FAQ

Sources

Discussion

Related Assets

ros2ai — ROS 2 CLI Extension with LLMs

youtube-dl — Command-Line Video Downloader for Hundreds of Sites

HTTPie CLI — Modern User-Friendly Command-Line HTTP Client

Yargs — Interactive CLI Argument Parser for Node.js