CLI Tools · May 14, 2026 · 2 min read

vllm-cli — vLLM Model Serving CLI (Python)

vllm-cli is a command-line tool for serving models with vLLM. The repository is verified at 493★, supports Python 3.9+, and documents profiles, shortcuts, and `serve --model` workflows.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and the raw content to help agents judge fit, risk, and next actions.

Native · 94/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Type: CLI
Installation: Bundle
Trust: Established
Entry point: pip install vllm-cli
Universal CLI command: npx tokrepo install 40ec8ddf-a76c-5fa0-9d20-f54ab035128d
Introduction


Best for: Builders who want a menu-driven TUI plus scriptable commands for managing vLLM model servers

Works with: Python 3.9+, vLLM installed separately (README notes CUDA/PyTorch compatibility), optional uv/conda workflows

Setup time: 15-30 minutes

Key facts (verified)

  • GitHub: 493 stars · 28 forks · pushed 2026-01-25.
  • License: MIT · owner avatar + repo URL verified via GitHub API.
  • README-backed entrypoint: pip install vllm-cli.

Main points

  • Start in interactive mode (vllm-cli) when setting up GPUs/profiles, then switch to command-line mode for repeatable automation runs (see the sketch after this list).

  • Use built-in profiles and shortcuts to codify serving parameters; README shows serve --shortcut and hardware-optimized GPT-OSS profiles.

  • Treat vLLM install as a separate compatibility step: README warns CUDA kernels must match PyTorch versions and vLLM-CLI won’t install vLLM by default.
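
A minimal sketch of that workflow, built from the README-backed commands above; the shortcut name `my-gpt-oss` is a hypothetical placeholder, since the README documents the `--shortcut` flag but this page does not quote a specific shortcut name:

```bash
# First-time setup: launch the interactive TUI to configure GPUs/profiles
vllm-cli

# Basic serving example quoted from the README
vllm-cli serve --model openai/gpt-oss-20b

# Repeatable automation run via a saved shortcut; "my-gpt-oss" is a
# hypothetical name, only the --shortcut flag itself is README-backed
vllm-cli serve --shortcut my-gpt-oss
```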

Source-backed notes

  • README documents Python 3.9+ support and multiple install options including pip install vllm-cli and pip install vllm-cli[vllm].
  • README includes a basic usage snippet: vllm-cli serve --model openai/gpt-oss-20b.
  • README notes vLLM binary compatibility concerns and recommends uv/conda-style installs for PyTorch/CUDA alignment (install options are sketched after this list).
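
A hedged sketch of the install options from those notes; the uv commands are one plausible way to follow the README's uv/conda recommendation, not a sequence this page quotes:

```bash
# Option 1: CLI only -- vLLM and a matching PyTorch/CUDA stack must already exist
pip install vllm-cli

# Option 2: pull vLLM in via the documented extra (quotes keep the shell
# from expanding the brackets)
pip install "vllm-cli[vllm]"

# Option 3 (assumption): a uv-managed environment for PyTorch/CUDA alignment
uv venv && source .venv/bin/activate
uv pip install vllm vllm-cli
```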

FAQ

  • Does vllm-cli install vLLM for me?: Not by default — README says vLLM-CLI will not install vLLM or PyTorch unless you use the extra.
  • What is the first serving command to try?: README shows vllm-cli serve --model openai/gpt-oss-20b as a basic example (a quick end-to-end check is sketched after this list).
  • Why does install matter?: README warns vLLM uses pre-compiled CUDA kernels that must match your PyTorch version.
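
One way to sanity-check a running server, assuming vllm-cli exposes vLLM's default OpenAI-compatible API on localhost:8000; neither the port nor the passthrough behavior is confirmed by this page:

```bash
# Terminal 1: start the README's basic example
vllm-cli serve --model openai/gpt-oss-20b

# Terminal 2: list served models via vLLM's default OpenAI-compatible
# endpoint (port and path are assumptions, not confirmed by this page)
curl http://localhost:8000/v1/models
```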

Source and acknowledgements

Source: https://github.com/Chen-zexi/vllm-cli · License: MIT · GitHub stars: 493 · forks: 28
