CLI Tools · May 14, 2026 · 2 min read

vllm-cli — vLLM Model Serving CLI (Python)

vllm-cli is a CLI for serving models with vLLM; verified at 493★, it supports Python 3.9+ and documents profiles, shortcuts, and `serve --model` workflows.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, a per-adapter plan, and raw content so agents can assess compatibility, risk, and next steps.

Native · 94/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Type: CLI
Install: Bundle
Trust: Established
Entrypoint: pip install vllm-cli
Universal CLI command: npx tokrepo install 40ec8ddf-a76c-5fa0-9d20-f54ab035128d
Introduction

Best for: Builders who want a menu-driven TUI plus scriptable commands for managing vLLM model servers

Works with: Python 3.9+, vLLM installed separately (README notes CUDA/PyTorch compatibility), optional uv/conda workflows

Setup time: 15-30 minutes

Key facts (verified)

  • GitHub: 493 stars · 28 forks · pushed 2026-01-25.
  • License: MIT · owner avatar + repo URL verified via GitHub API.
  • README-backed entrypoint: pip install vllm-cli.

Main tips

  • Start in interactive mode (vllm-cli) when setting up GPUs/profiles, then switch to command-line mode for repeatable automation runs (see the sketch after this list).

  • Use built-in profiles and shortcuts to codify serving parameters; README shows serve --shortcut and hardware-optimized GPT-OSS profiles.

  • Treat vLLM install as a separate compatibility step: README warns CUDA kernels must match PyTorch versions and vLLM-CLI won’t install vLLM by default.
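
A minimal shell sketch of that flow, from the interactive TUI to scriptable runs. The serve --shortcut flag and the openai/gpt-oss-20b example are README-backed; the shortcut name my-gpt-oss is hypothetical.

    # Launch the interactive TUI to set up GPUs and profiles
    vllm-cli

    # Repeatable, scriptable run from a saved shortcut (name is hypothetical)
    vllm-cli serve --shortcut my-gpt-oss

    # Basic serve command, as documented in the README
    vllm-cli serve --model openai/gpt-oss-20b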

Source-backed notes

  • README documents Python 3.9+ support and multiple install options, including pip install vllm-cli and pip install vllm-cli[vllm] (both sketched after this list).
  • README includes a basic usage snippet: vllm-cli serve --model openai/gpt-oss-20b.
  • README notes vLLM binary compatibility concerns and recommends uv/conda-style installs for PyTorch/CUDA alignment.
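
A hedged install sketch based on those notes. The two pip commands are README-documented; the uv variant is an assumption that follows the README's uv/conda recommendation, using uv's pip-compatible interface.

    # CLI only: vLLM must already be installed and match your CUDA/PyTorch build
    pip install vllm-cli

    # Optional extra that also pulls in vLLM (quotes stop the shell from globbing brackets)
    pip install "vllm-cli[vllm]"

    # Illustrative uv alternative for tighter PyTorch/CUDA alignment (assumption)
    uv pip install "vllm-cli[vllm]"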

FAQ

  • Does vllm-cli install vLLM for me?: Not by default; the README says vLLM-CLI will not install vLLM or PyTorch unless you install the [vllm] extra.
  • What is the first serving command to try?: README shows vllm-cli serve --model openai/gpt-oss-20b as a basic example.
  • Why does install matter?: README warns vLLM uses pre-compiled CUDA kernels that must match your PyTorch version (a quick check is sketched below).
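
Because a kernel/PyTorch mismatch tends to surface only at serve time, a quick pre-flight check helps; this one-liner is a generic sketch using PyTorch's own introspection, not a vllm-cli command.

    # Print the PyTorch build, its CUDA version, and whether a GPU is visible
    python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"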

Source & acknowledgements

Source: https://github.com/Chen-zexi/vllm-cli · License: MIT · GitHub stars: 493 · forks: 28
