Scripts · May 4, 2026 · 2 min read

LLMFit — Find What Models Run on Your Hardware

A Rust CLI that scans your system specs and matches them against hundreds of LLMs and providers to tell you exactly what you can run locally.

Introduction

LLMFit is an open-source Rust CLI that detects your hardware capabilities and recommends which LLMs you can run locally. It eliminates the guesswork of matching GPU VRAM, RAM, and compute power to specific model sizes and quantization levels.

What LLMFit Does

  • Scans system hardware (GPU VRAM, RAM, CPU cores, disk space)
  • Matches against a registry of hundreds of models across providers
  • Recommends optimal quantization formats (GGUF, MLX, GPTQ) per model
  • Filters by provider compatibility (Ollama, llama.cpp, vLLM, MLX)
  • Outputs structured JSON for scripting and automation
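That last point is worth a concrete example. Below is a minimal sketch of consuming a saved LLMFit JSON report from another Rust program using the serde and serde_json crates. The field names (gpu_vram_gb, recommendations, and so on) are hypothetical placeholders rather than LLMFit's documented schema, so adjust the structs to match the real output.

```rust
// Sketch: consuming a saved LLMFit JSON report in a script.
// NOTE: these field names are hypothetical, not LLMFit's documented schema.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Recommendation {
    model: String,        // e.g. "llama-3.1-8b"
    quantization: String, // e.g. "Q4_K_M"
    est_vram_gb: f64,     // estimated VRAM footprint in GiB
}

#[derive(Debug, Deserialize)]
struct Report {
    gpu_vram_gb: f64,
    ram_gb: f64,
    recommendations: Vec<Recommendation>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Read a report previously saved from LLMFit's JSON output.
    let raw = std::fs::read_to_string("report.json")?;
    let report: Report = serde_json::from_str(&raw)?;

    println!("VRAM: {:.1} GiB, RAM: {:.1} GiB", report.gpu_vram_gb, report.ram_gb);
    for rec in &report.recommendations {
        println!("{} ({}) ~{:.1} GiB", rec.model, rec.quantization, rec.est_vram_gb);
    }
    Ok(())
}
```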

Architecture Overview

LLMFit is a single Rust binary with no runtime dependencies. It queries system hardware via platform-native APIs (NVML for NVIDIA, Metal for Apple Silicon), then cross-references a bundled model registry that maps each model variant to its memory and compute requirements. The registry is updated via a simple pull mechanism from the upstream repository.
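To make the detection step concrete, here is a rough sketch of NVML-based VRAM discovery using the nvml-wrapper crate, including the multi-GPU aggregation mentioned in the FAQ below. LLMFit's actual implementation may use different crates or bindings; treat this as an outline of the technique rather than the project's code.

```rust
// Sketch: NVML-based VRAM detection for NVIDIA GPUs via the nvml-wrapper crate.
// Illustrates the general approach; LLMFit's internals may differ.
use nvml_wrapper::Nvml;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let nvml = Nvml::init()?;            // load and initialize the NVML library
    let count = nvml.device_count()?;

    let mut total_vram_bytes: u64 = 0;
    for idx in 0..count {
        let device = nvml.device_by_index(idx)?;
        let mem = device.memory_info()?; // total/free/used VRAM in bytes
        println!(
            "GPU {}: {} ({:.1} GiB total, {:.1} GiB free)",
            idx,
            device.name()?,
            mem.total as f64 / 1024f64.powi(3),
            mem.free as f64 / 1024f64.powi(3),
        );
        total_vram_bytes += mem.total;   // aggregate across GPUs for split-model fits
    }
    println!(
        "Aggregate VRAM: {:.1} GiB",
        total_vram_bytes as f64 / 1024f64.powi(3)
    );
    Ok(())
}
```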

Self-Hosting & Configuration

  • Install via cargo or download prebuilt binaries from GitHub Releases
  • No external services or API keys required
  • Configure custom model registries via TOML files (a schema sketch follows this list)
  • Supports offline operation with bundled model database
  • Works on Linux, macOS (Intel and Apple Silicon), and Windows
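For the custom-registry point above, the sketch below shows one way such a TOML file could be structured and parsed with the serde and toml crates. The schema (field names and layout) is an illustrative assumption; check the project's documentation for the actual registry format.

```rust
// Sketch: a hypothetical custom model-registry entry and how it might be parsed.
// The schema below is illustrative only, not LLMFit's documented format.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct ModelEntry {
    name: String,               // e.g. "mistral-7b-instruct"
    params_b: f64,              // parameter count in billions
    quantizations: Vec<String>, // e.g. ["Q4_K_M", "Q8_0"]
    min_vram_gb: f64,           // rough minimum VRAM for the smallest quant
}

#[derive(Debug, Deserialize)]
struct Registry {
    model: Vec<ModelEntry>,     // [[model]] array-of-tables in TOML
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let raw = r#"
        [[model]]
        name = "mistral-7b-instruct"
        params_b = 7.3
        quantizations = ["Q4_K_M", "Q8_0"]
        min_vram_gb = 5.0
    "#;
    let registry: Registry = toml::from_str(raw)?;
    println!("{:#?}", registry.model);
    Ok(())
}
```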

Key Features

  • Zero-dependency single binary written in Rust
  • Supports NVIDIA, AMD, Apple Silicon, and CPU-only configurations
  • Recommends specific quantization levels per available VRAM
  • Integrates with Ollama, llama.cpp, MLX, and vLLM ecosystems
  • Extensible model registry with community contributions

Comparison with Similar Tools

  • Ollama — runs models but does not pre-assess hardware compatibility
  • LM Studio — GUI-based model browser without CLI automation
  • GPT4All — bundled models with limited hardware-aware recommendations
  • LocalAI — serving platform, not a hardware assessment tool
  • llama.cpp — inference engine requiring manual model selection

FAQ

Q: Does LLMFit download or run models? A: No. It only scans hardware and recommends models. You use your preferred runtime to actually download and serve them.

Q: How is the model registry kept up to date? A: The registry ships with the binary and can be updated via llmfit update. Community PRs add new models regularly.

Q: Does it support multi-GPU setups? A: Yes. It detects all available GPUs and calculates aggregate VRAM for split-model recommendations.

Q: What quantization formats does it cover? A: GGUF (Q4, Q5, Q8), MLX 4-bit, GPTQ, AWQ, and full-precision variants.
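As a rough guide to how those bit widths translate into memory, the sketch below applies the common rule of thumb of parameters × bits per weight / 8, plus headroom for activations and KV cache. The effective bits-per-weight values and the 20% overhead factor are assumptions for illustration, not LLMFit's exact sizing model.

```rust
// Rough rule-of-thumb memory estimate per quantization level.
// The overhead factor is assumed; real footprints depend on context length,
// KV-cache precision, and the runtime used.
fn estimated_gib(params_billions: f64, bits_per_weight: f64) -> f64 {
    let weight_bytes = params_billions * 1e9 * bits_per_weight / 8.0;
    let overhead = 1.2; // ~20% headroom for activations and KV cache (assumed)
    weight_bytes * overhead / 1024f64.powi(3)
}

fn main() {
    // Approximate effective bits per weight for common GGUF quants and FP16.
    for (label, bits) in [("Q4", 4.5), ("Q5", 5.5), ("Q8", 8.5), ("FP16", 16.0)] {
        println!("7B model @ {:>4}: ~{:.1} GiB", label, estimated_gib(7.0, bits));
    }
}
```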
