Scripts · June 1, 2026 · 1 min read

LLMFit — Find Which LLM Runs on Your Hardware

A Rust CLI that scans your system specs and matches them against hundreds of models and providers to tell you what you can run locally.

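Agent Ready

Agent can install directly

This asset is installable; the agent first selects the current runtime, checks the install plan, and then runs the matching command.

Native · 98/100 · Policy: allowed
Agent entry: any MCP/CLI agent
Type: Skill
Install: Single
Trust: trust level Established
Entry: LLMFit Overview
Direct install command:
npx -y tokrepo@latest install a73d28f2-5df6-11f1-9bc6-00163e2b0d79 --target codex

Do a dry run first to confirm the install plan, then run this command.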

Introduction

LLMFit is a single-command Rust CLI that inspects your CPU, GPU, and RAM, then tells you exactly which large language models you can run locally. It supports hundreds of models across GGUF, SafeTensors, MLX, and Unsloth formats, removing the guesswork from local AI deployment.

What LLMFit Does

  • Detects available GPU VRAM, system RAM, and compute capabilities automatically
  • Matches the detected hardware profile against a curated registry of models and providers
  • Recommends quantization levels (Q4, Q5, Q8, FP16) that fit within your memory budget
  • Supports NVIDIA CUDA, Apple Metal/MLX, AMD ROCm, and CPU-only setups
  • Outputs results as a ranked table or JSON for scripting
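
To make the quantization recommendation concrete, here is a minimal sketch of how a memory budget could be checked against each quantization level. The bits-per-weight figures and the flat 20% overhead factor are illustrative assumptions, not values taken from LLMFit, and the function names are hypothetical.

```python
from typing import Optional

# Approximate bits per weight for each level (illustrative averages only;
# real quantization variants differ, and LLMFit's own figures are not known here).
BITS_PER_WEIGHT = {"Q4": 4.5, "Q5": 5.5, "Q8": 8.5, "FP16": 16.0}


def estimate_gib(params_billion: float, quant: str, overhead: float = 1.2) -> float:
    """Rough footprint in GiB: weight bytes times an assumed overhead factor
    that stands in for KV cache and runtime buffers."""
    weight_bytes = params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8
    return weight_bytes * overhead / 2**30


def best_quant(params_billion: float, budget_gib: float) -> Optional[str]:
    """Return the highest-precision level that fits the budget, or None."""
    for quant in ("FP16", "Q8", "Q5", "Q4"):  # highest precision first
        if estimate_gib(params_billion, quant) <= budget_gib:
            return quant
    return None
```

Under these assumptions, a 7B-parameter model on an 8 GiB budget lands on Q5, because the Q8 estimate comes out slightly above 8 GiB.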

Architecture Overview

LLMFit is a single statically linked Rust binary with zero runtime dependencies. On launch it probes GPU and system information via platform APIs (CUDA, Metal, sysinfo), loads its model registry from an embedded catalog (refreshed with llmfit update), and runs a constraint solver that matches each model's memory requirements against the available resources. Results are streamed to stdout as either a human-readable table or structured JSON.
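
The matching step can be pictured with a small sketch. This is a plain filter-and-rank pass, not LLMFit's actual solver; the Model and Hardware types, the field names, and the catalog entries are all hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class Model:
    name: str            # hypothetical catalog entry name
    fmt: str             # e.g. "GGUF"
    required_gib: float  # assumed total memory requirement


@dataclass
class Hardware:
    vram_gib: float
    ram_gib: float


def match(catalog: List[Model], hw: Hardware) -> List[Dict]:
    """Keep models that fit, preferring GPU placement over system RAM."""
    results = []
    for model in catalog:
        if model.required_gib <= hw.vram_gib:
            placement = "gpu"
        elif model.required_gib <= hw.ram_gib:
            placement = "cpu"
        else:
            continue  # does not fit anywhere
        results.append({**asdict(model), "placement": placement})
    # Rank GPU-resident models first, then larger models first.
    results.sort(key=lambda r: (r["placement"] != "gpu", -r["required_gib"]))
    return results
```

Serializing the returned list with json.dumps would give the structured form, while a formatted loop over the same list would give the table; the ranking rule here is only one plausible choice.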

Self-Hosting & Configuration

  • Install via cargo install llmfit or download a prebuilt binary from GitHub Releases
  • No server component or daemon required — purely a local CLI tool
  • Update the model registry: llmfit update
  • Override detected VRAM with --vram 24GB for planning on different hardware
  • Filter results by provider, format, or model family with CLI flags
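
For planning across several candidate GPUs, the --vram override can be scripted. The sketch below assumes the bare llmfit command accepts --vram directly with no subcommand, which the text above suggests but does not confirm; the helper names are hypothetical.

```python
import shutil
import subprocess
from typing import Dict, List


def build_command(vram: str) -> List[str]:
    """Build an argv list that overrides the detected VRAM, e.g. '24GB'."""
    # Assumption: no subcommand is required before --vram.
    return ["llmfit", "--vram", vram]


def plan(vram_sizes: List[str]) -> Dict[str, str]:
    """Run llmfit once per VRAM size and collect the raw stdout.
    Returns an empty dict when the binary is not on PATH."""
    if shutil.which("llmfit") is None:
        return {}
    reports = {}
    for size in vram_sizes:
        proc = subprocess.run(
            build_command(size), capture_output=True, text=True, check=False
        )
        reports[size] = proc.stdout
    return reports
```

Calling plan(["8GB", "16GB", "24GB"]) would then give one report per hypothetical card, which is handy when comparing hardware before buying it.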

Key Features

  • Single binary, zero dependencies — runs on Linux, macOS, and Windows
  • Covers GGUF (llama.cpp), SafeTensors (Hugging Face), MLX (Apple), and Unsloth formats
  • Hardware auto-detection for NVIDIA, AMD, Apple Silicon, and CPU
  • JSON output mode for CI/CD pipeline integration
  • Frequently updated model catalog with community contributions
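
As an illustration of the CI/CD use case, the following sketch gates a pipeline step on whether a required model fits. The JSON shape (a list of objects with "name" and "fits" keys) is purely an assumption for this example; the real output schema should be checked against the tool itself.

```python
import json
from typing import List


def runnable_models(report_json: str) -> List[str]:
    """Return model names from a report shaped like
    [{"name": "...", "fits": true}, ...]. This schema is hypothetical."""
    entries = json.loads(report_json)
    return [entry["name"] for entry in entries if entry.get("fits")]


def ci_gate(report_json: str, required_model: str) -> int:
    """Exit code for a CI step: 0 if the required model fits, 1 otherwise."""
    return 0 if required_model in runnable_models(report_json) else 1
```

A pipeline step could pass the captured JSON output to ci_gate and hand the result to sys.exit, so a build fails early when the target runner cannot hold the model.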

Comparison with Similar Tools

  • Ollama: a runtime that pulls and serves models; LLMFit only advises what fits and does not serve models
  • GPT4All: a bundled desktop app with a limited model selection; LLMFit covers broader registries
  • LM Studio: a GUI-based model browser; LLMFit is headless and scriptable
  • candle: a Rust inference library; LLMFit is a recommendation tool, not an inference engine

FAQ

Q: Does LLMFit download or run models?
A: No. It only scans hardware and recommends compatible models. You still need a runtime like llama.cpp or Ollama to actually run them.

Q: How often is the model catalog updated?
A: The embedded catalog ships with each release. Run llmfit update to pull the latest catalog between releases.

Q: Does it support multi-GPU setups?
A: Yes. LLMFit detects all available GPUs and can recommend models that fit across combined VRAM.
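
A rough way to reason about combined VRAM is sketched below. The per-GPU reserve of 0.5 GiB and the three placement labels are assumptions made for illustration, and splitting a model across cards carries real-world overheads that this arithmetic ignores.

```python
from typing import List


def usable_gib(gpus_gib: List[float], reserve_gib: float = 0.5) -> List[float]:
    """VRAM per GPU after subtracting an assumed fixed reserve for the driver."""
    return [max(gpu - reserve_gib, 0.0) for gpu in gpus_gib]


def placement(required_gib: float, gpus_gib: List[float]) -> str:
    """Classify a model as fitting on one GPU, across all GPUs, or not at all."""
    usable = usable_gib(gpus_gib)
    if required_gib <= max(usable, default=0.0):
        return "single-gpu"
    if required_gib <= sum(usable):
        return "multi-gpu"
    return "does-not-fit"
```

With two 24 GiB cards, a model needing 40 GiB is classified as multi-gpu here, since neither card alone has enough room but the pair does.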

Q: Is it free and open source?
A: Yes. LLMFit is MIT-licensed and fully open source.
