What is vllm-cli — vLLM Model Serving CLI (Python)?

vllm-cli is a command-line tool for serving models with vLLM. The listing records 493 stars (verified) and Python 3.9+ support, with docs covering profiles, shortcuts, and `serve --model` workflows.

Is vllm-cli — vLLM Model Serving CLI (Python) free to use?

Yes. vllm-cli — vLLM Model Serving CLI (Python) is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install vllm-cli — vLLM Model Serving CLI (Python)?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

vllm-cli — vLLM Model Serving CLI (Python)

Main

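Start with interactive mode (vllm-cli) to configure GPUs and profiles; once that works, switch to command-line mode for reproducible, automated launches.
Lock in serving parameters with profiles and shortcuts: the README mentions serve --shortcut and provides hardware-optimized profiles for GPT-OSS.
Treat the vLLM installation as a separate compatibility step: the README warns that the CUDA kernels must match the PyTorch version, and that vLLM-CLI does not install vLLM by default.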
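
The move from interactive setup to reproducible command-line launches can be sketched as a small Python wrapper. This is a minimal illustration, not part of vllm-cli: the helper names build_serve_command and launch are hypothetical, and it assumes that --shortcut takes the saved shortcut's name as its value, which the notes above do not spell out. Only the serve --model and serve --shortcut forms mentioned in the README are used.

```python
import shlex
import subprocess
from typing import List, Optional


def build_serve_command(model: Optional[str] = None,
                        shortcut: Optional[str] = None) -> List[str]:
    """Return the argv for a `vllm-cli serve` launch (hypothetical helper).

    Exactly one of `model` or `shortcut` must be given.
    """
    if (model is None) == (shortcut is None):
        raise ValueError("pass exactly one of model or shortcut")
    if model is not None:
        return ["vllm-cli", "serve", "--model", model]
    # Assumption: --shortcut takes the name of a previously saved shortcut.
    return ["vllm-cli", "serve", "--shortcut", shortcut]


def launch(argv: List[str], dry_run: bool = True) -> Optional[int]:
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(argv))
    if dry_run:
        return None
    # Requires vllm-cli (and a compatible vLLM install) on PATH.
    return subprocess.run(argv, check=False).returncode


if __name__ == "__main__":
    # Dry run: shows the command a script or CI job would execute.
    launch(build_serve_command(model="openai/gpt-oss-20b"))
```

Keeping the command in a list rather than a shell string makes the launch easy to log and version alongside the rest of a deployment script.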

Source-backed notes

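The README lists Python 3.9+ as supported and gives several installation options, including pip install vllm-cli and pip install vllm-cli[vllm].
The README provides a basic usage example: vllm-cli serve --model openai/gpt-oss-20b.
The README flags vLLM's binary compatibility issue and recommends installing via uv or conda to keep PyTorch and CUDA matched.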
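
Because vLLM is not installed by default, a quick pre-flight check before the first serve can show what is actually present in the environment. The sketch below is an illustration only: the function names are hypothetical, and it reports installed versions using the PyPI distribution names vllm-cli, vllm, and torch. It does not verify that the CUDA kernels match the PyTorch build; that still has to be checked against vLLM's own installation guidance.

```python
from importlib import metadata
from typing import Dict, List, Optional

# Distribution names as published on PyPI.
PACKAGES = ("vllm-cli", "vllm", "torch")


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def preflight_report() -> Dict[str, Optional[str]]:
    """Map each relevant package to its installed version (or None)."""
    return {name: installed_version(name) for name in PACKAGES}


def summarize(report: Dict[str, Optional[str]]) -> List[str]:
    """Turn a report into human-readable lines, flagging a missing vLLM."""
    lines = [f"{name}: {version or 'not installed'}"
             for name, version in report.items()]
    if report.get("vllm") is None:
        lines.append("vLLM is missing: install it separately or use the "
                     "vllm-cli[vllm] extra.")
    return lines


if __name__ == "__main__":
    for line in summarize(preflight_report()):
        print(line)
```

Running this inside the same uv or conda environment that will host the server keeps the report aligned with what vllm-cli will actually see.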

FAQ

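Does vllm-cli install vLLM for me by default? No. The README states that vLLM and PyTorch are not installed by default (unless you use an install option with an extra).
Which serve command should I try first? The README's basic example is vllm-cli serve --model openai/gpt-oss-20b.
Why does installation compatibility matter? The README warns that vLLM ships precompiled CUDA kernels, which must match the PyTorch version.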

Related assets

OpenAnt — Verified Vuln Pipeline CLI (Go + Python)

agent-browser — AI Agent Browser Automation CLI (2026)

Lemonade — Local AI Server + CLI (Chat/Image/Speech)

yt-fts — YouTube Full-Text Search CLI