Main
- Start in interactive mode (`vllm-cli`) when setting up GPUs and profiles, then switch to command-line mode for repeatable automation runs.
- Use built-in profiles and shortcuts to codify serving parameters; the README shows `serve --shortcut` and hardware-optimized GPT-OSS profiles.
- Treat the vLLM install as a separate compatibility step: the README warns that vLLM's CUDA kernels must match your PyTorch version and that vLLM-CLI won't install vLLM by default.
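The two modes above might look like this in practice; a minimal sketch, assuming vLLM-CLI is already installed (`my-gpt-oss` is a hypothetical shortcut name, not one from the README):

```shell
# Interactive mode: launch the TUI to inspect GPUs and save a serving profile.
vllm-cli

# Command-line mode: replay the saved configuration for repeatable runs.
# "serve --shortcut" is shown in the README; the shortcut name is illustrative.
vllm-cli serve --shortcut my-gpt-oss
```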
Source-backed notes
- README documents Python 3.9+ support and multiple install options, including `pip install vllm-cli` and `pip install vllm-cli[vllm]`.
- README includes a basic usage snippet: `vllm-cli serve --model openai/gpt-oss-20b`.
- README notes vLLM binary compatibility concerns and recommends uv/conda-style installs for PyTorch/CUDA alignment.
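The install options above can be sketched as follows; a non-authoritative example assuming a recent pip (the uv line is the README's recommended alternative, shown commented out):

```shell
# Minimal install: vLLM-CLI only; vLLM and PyTorch must already be present.
pip install vllm-cli

# Convenience extra: also pulls in vLLM (quotes keep zsh from globbing the brackets).
pip install "vllm-cli[vllm]"

# If you manage PyTorch/CUDA alignment yourself, a uv-style install keeps pins explicit:
# uv pip install vllm-cli
```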
FAQ
- Does vllm-cli install vLLM for me? Not by default: the README says vLLM-CLI will not install vLLM or PyTorch unless you use the `[vllm]` extra.
- What is the first serving command to try? The README shows `vllm-cli serve --model openai/gpt-oss-20b` as a basic example.
- Why does the install matter? The README warns that vLLM uses pre-compiled CUDA kernels that must match your PyTorch version.
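The kernel/PyTorch mismatch warning can be made concrete with a small version check; a hedged sketch that compares a wheel-style CUDA tag (e.g. `cu121`) against the string PyTorch reports in `torch.version.cuda`. The helper below is illustrative and not part of vllm-cli:

```python
from typing import Optional

def cuda_tag_matches(wheel_tag: str, torch_cuda: Optional[str]) -> bool:
    """Compare a wheel-style CUDA tag like 'cu121' with a
    torch.version.cuda string like '12.1' (None on CPU-only builds)."""
    if torch_cuda is None or not wheel_tag.startswith("cu"):
        return False
    digits = wheel_tag[2:]
    if len(digits) < 2 or not digits.isdigit():
        return False
    # 'cu121' -> '12.1': the last digit is the minor version.
    normalized = f"{digits[:-1]}.{digits[-1]}"
    return normalized == torch_cuda

# A cu121 wheel pairs with a PyTorch build reporting CUDA 12.1.
print(cuda_tag_matches("cu121", "12.1"))  # True
print(cuda_tag_matches("cu118", "12.1"))  # False
```

In real use you would read the right-hand side from `torch.version.cuda` after importing torch; the point is only that the two version strings must agree before vLLM's pre-compiled kernels will load.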