Scripts · Jun 1, 2026 · 3 min read

LLMFit — Find Which LLM Runs on Your Hardware

A Rust CLI that scans your system specs and matches them against hundreds of models and providers to tell you what you can run locally.

Agent ready

Agent installation ready

This asset can be installed after choosing the runtime, reviewing the plan, and running the appropriate command.

Native · 98/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Type: Skill
Installation: Single
Trust: Established
Entry point: LLMFit Overview
Direct install command:
npx -y tokrepo@latest install a73d28f2-5df6-11f1-9bc6-00163e2b0d79 --target codex

Run this after confirming the plan in a dry run.

Introduction

LLMFit is a single-command Rust CLI that inspects your CPU, GPU, and RAM, then reports which large language models you can run locally. It covers hundreds of models across GGUF, SafeTensors, MLX, and Unsloth formats, which takes the guesswork out of local AI deployment.

What LLMFit Does

  • Detects available GPU VRAM, system RAM, and compute capabilities automatically
  • Matches the detected hardware profile against a curated registry of models and providers
  • Recommends quantization levels (Q4, Q5, Q8, FP16) that fit within your memory budget
  • Supports NVIDIA CUDA, Apple Metal/MLX, AMD ROCm, and CPU-only setups
  • Outputs results as a ranked table or JSON for scripting
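
To make the quantization recommendation concrete, here is a minimal sketch of how a memory-budget check might work. It is not LLMFit's actual logic: the bits-per-weight figures, the fixed overhead, and the function names `estimate_weights_gb` and `best_quant` are all illustrative assumptions.

```python
# Approximate bits per weight for each quantization level.
# ASSUMPTION: these figures are illustrative, not taken from LLMFit.
BITS_PER_WEIGHT = {"Q4": 4.5, "Q5": 5.5, "Q8": 8.5, "FP16": 16.0}


def estimate_weights_gb(params_billion: float, quant: str) -> float:
    """Estimate weight memory in GB (1 GB = 1e9 bytes) for a model."""
    bits = BITS_PER_WEIGHT[quant]
    # params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billion * bits / 8


def best_quant(params_billion: float, budget_gb: float, overhead_gb: float = 1.5):
    """Return the highest-precision level that fits the budget, or None.

    ASSUMPTION: overhead_gb is a flat allowance for KV cache and runtime
    buffers; real overhead depends on context length and the runtime.
    """
    for quant in ("FP16", "Q8", "Q5", "Q4"):  # highest precision first
        if estimate_weights_gb(params_billion, quant) + overhead_gb <= budget_gb:
            return quant
    return None


if __name__ == "__main__":
    for budget in (4, 8, 24):
        print(f"7B model, {budget} GB budget -> {best_quant(7, budget)}")
```

Under these assumptions a 7B model would need roughly 14 GB of weights at FP16, so an 8 GB budget falls back to a lower-precision level.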

Architecture Overview

LLMFit is a single statically linked Rust binary with no runtime dependencies. On launch it probes GPU and system information through platform APIs (CUDA, Metal, sysinfo), loads its model registry from an embedded catalog (refreshed with llmfit update), and runs a constraint solver that matches each model's memory requirements against the available resources. Results are streamed to stdout as either a human-readable table or structured JSON.
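
The matching step can be pictured with a small sketch. This is a plain filter-and-rank pass written in Python for brevity, not the Rust constraint solver LLMFit ships; the `Model` and `Hardware` types, the catalog entries, and the `placement` field are hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Model:
    name: str           # hypothetical catalog entry, not a real registry name
    fmt: str            # e.g. "GGUF"
    required_gb: float  # assumed total memory needed to load the model


@dataclass
class Hardware:
    vram_gb: float
    ram_gb: float


def match(hw: Hardware, catalog: list) -> list:
    """Keep models that fit in VRAM or system RAM and rank them.

    GPU-resident candidates come first, then larger models before smaller.
    """
    results = []
    for m in catalog:
        if m.required_gb <= hw.vram_gb:
            placement = "gpu"
        elif m.required_gb <= hw.ram_gb:
            placement = "cpu"
        else:
            continue  # does not fit anywhere on this machine
        results.append({**asdict(m), "placement": placement})
    results.sort(key=lambda r: (r["placement"] != "gpu", -r["required_gb"]))
    return results


if __name__ == "__main__":
    catalog = [
        Model("model-a-7b-q4", "GGUF", 5.5),
        Model("model-b-13b-q4", "GGUF", 9.0),
        Model("model-c-70b-q4", "GGUF", 42.0),
    ]
    print(json.dumps(match(Hardware(vram_gb=8, ram_gb=32), catalog), indent=2))
```

A real solver would also weigh compute capability, format support per backend, and partial GPU offload, which this sketch leaves out.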

Self-Hosting & Configuration

  • Install via cargo install llmfit or download a prebuilt binary from GitHub Releases
  • No server component or daemon required — purely a local CLI tool
  • Update the model registry: llmfit update
  • Override detected VRAM with --vram 24GB for planning on different hardware
  • Filter results by provider, format, or model family with CLI flags
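
An override such as --vram 24GB implies turning a human-readable size into a byte count. The sketch below shows one way that could be done; the accepted units, the binary (1024-based) interpretation of GB, and the `parse_size` helper are assumptions, since the source only shows the `24GB` form.

```python
import re

# ASSUMPTION: units are interpreted as binary multiples (1 GB = 1024**3 bytes).
_UNITS = {"MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text: str) -> int:
    """Parse a size string like '24GB' or '1.5 gb' into a number of bytes."""
    m = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(MB|GB|TB)\s*", text, flags=re.IGNORECASE)
    if m is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = float(m.group(1)), m.group(2).upper()
    return int(value * _UNITS[unit])


if __name__ == "__main__":
    print(parse_size("24GB"))
```

Rejecting unknown input with an error, rather than guessing a unit, keeps a mistyped override from silently skewing the recommendations.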

Key Features

  • Single binary, zero dependencies — runs on Linux, macOS, and Windows
  • Covers GGUF (llama.cpp), SafeTensors (Hugging Face), MLX (Apple), and Unsloth formats
  • Hardware auto-detection for NVIDIA, AMD, Apple Silicon, and CPU
  • JSON output mode for CI/CD pipeline integration
  • Frequently updated model catalog with community contributions
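
For the CI/CD use case, a pipeline step might check that a required model appears in the JSON report before continuing. The schema here is an assumption: the sketch supposes the report is a JSON array of objects with a "name" field, which the source does not confirm, and `model_fits` is a hypothetical helper.

```python
import json


def model_fits(report_json: str, model_name: str) -> bool:
    """Return True if model_name appears in the (assumed) JSON report.

    ASSUMPTION: the report is a JSON array of objects with a "name" key.
    """
    entries = json.loads(report_json)
    return any(entry.get("name") == model_name for entry in entries)


# Hypothetical report text, standing in for output captured from the CLI.
sample = '[{"name": "example-7b", "quant": "Q4"}, {"name": "example-13b", "quant": "Q4"}]'

if __name__ == "__main__":
    print(model_fits(sample, "example-7b"))
```

In a pipeline, the result would typically decide the step's exit code so the job fails early when the target runner cannot host the required model.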

Comparison with Similar Tools

  • Ollama: a runtime that pulls and serves models; LLMFit only advises on what fits and does not serve models
  • GPT4All: a bundled desktop app with a limited model selection; LLMFit covers broader registries
  • LM Studio: a GUI-based model browser; LLMFit is headless and scriptable
  • candle: a Rust inference library; LLMFit is a recommendation tool, not an inference engine

FAQ

Q: Does LLMFit download or run models? A: No. It only scans hardware and recommends compatible models. You still need a runtime like llama.cpp or Ollama to actually run them.

Q: How often is the model catalog updated? A: The embedded catalog ships with each release. Run llmfit update to pull the latest catalog between releases.

Q: Does it support multi-GPU setups? A: Yes. LLMFit detects all available GPUs and can recommend models that fit across combined VRAM.

Q: Is it free and open source? A: Yes. LLMFit is MIT-licensed and fully open source.
