# vllm-cli — vLLM Model Serving CLI (Python)

> vllm-cli is a CLI for serving models with vLLM; verified 493★ with Python 3.9+ and docs for profiles, shortcuts, and `serve --model` workflows.

## Install

Copy the commands below into your terminal:

## Quick Use

```bash
# Recommended flow per README (install vLLM first, then vllm-cli):
uv venv --python 3.12 --seed && source .venv/bin/activate
uv pip install vllm --torch-backend=auto
uv pip install --upgrade vllm-cli
vllm-cli serve --model openai/gpt-oss-20b
```

## Intro

**Best for:** builders who want a menu-driven TUI plus scriptable commands for managing vLLM model servers.
**Works with:** Python 3.9+; vLLM installed separately (the README notes CUDA/PyTorch compatibility); optional uv/conda workflows.
**Setup time:** 15-30 minutes.

### Key facts (verified)

- GitHub: 493 stars · 28 forks · pushed 2026-01-25.
- License: MIT · owner avatar and repo URL verified via the GitHub API.
- README-backed entrypoint: `pip install vllm-cli`.

## Main

- Start in interactive mode (`vllm-cli`) when setting up GPUs and profiles, then switch to command-line mode for repeatable automation runs.
- Use built-in profiles and shortcuts to codify serving parameters; the README shows `serve --shortcut` and hardware-optimized GPT-OSS profiles.
- Treat the vLLM install as a separate compatibility step: the README warns that CUDA kernels must match PyTorch versions and that vLLM-CLI will not install vLLM by default.

### Source-backed notes

- The README documents Python 3.9+ support and multiple install options, including `pip install vllm-cli` and `pip install vllm-cli[vllm]`.
- The README includes a basic usage snippet: `vllm-cli serve --model openai/gpt-oss-20b`.
- The README notes vLLM binary-compatibility concerns and recommends uv/conda-style installs for PyTorch/CUDA alignment.
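Because vllm-cli will not install vLLM for you, it can help to gate the serve command behind a couple of environment checks. A minimal pre-flight sketch of that idea (the check script itself is an illustration, not part of vllm-cli; only the Python 3.9+ requirement and the `serve --model` command are README-documented):

```bash
#!/usr/bin/env bash
# Pre-flight sketch: verify the environment before running `vllm-cli serve`.
set -euo pipefail

# 1. Python must be 3.9+ per the README.
python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 9) else 1)' \
  && echo "python: ok"

# 2. vllm-cli does not install vLLM by default, so check it is importable.
if python3 -c 'import vllm' 2>/dev/null; then
  echo "vllm: installed"
else
  echo "vllm: missing (install it first, e.g. uv pip install vllm)"
fi

# 3. Only then start the server (README-documented command, left commented here):
# vllm-cli serve --model openai/gpt-oss-20b
```

Running this before automation scripts surfaces the CUDA/PyTorch mismatch failure mode early, rather than at server start.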
### FAQ

- **Does vllm-cli install vLLM for me?** Not by default; the README says vLLM-CLI will not install vLLM or PyTorch unless you use the `[vllm]` extra.
- **What is the first serving command to try?** The README shows `vllm-cli serve --model openai/gpt-oss-20b` as a basic example.
- **Why does the install matter?** The README warns that vLLM uses pre-compiled CUDA kernels that must match your PyTorch version.

## Source & Thanks

> Source: https://github.com/Chen-zexi/vllm-cli
> License: MIT
> GitHub stars: 493 · forks: 28

---

Source: https://tokrepo.com/en/workflows/vllm-cli-vllm-model-serving-cli-python
Author: Script Depot