Scripts · Mar 31, 2026 · 2 min read

MLC-LLM — Universal LLM Deployment Engine

Deploy any LLM on any hardware — phones, browsers, GPUs, CPUs. Compiles models for native performance on iOS, Android, WebGPU, CUDA, Metal, and Vulkan. 22K+ stars.

Introduction

MLC-LLM is a universal deployment engine that runs large language models natively on any hardware. Using ML compilation (Apache TVM), it compiles LLMs for optimized inference on iOS, Android, WebGPU (browsers), CUDA, Metal, Vulkan, and CPUs. Run Llama, Mistral, Phi, Gemma, and other models at native speed everywhere — from phones to servers. 22,000+ GitHub stars, Apache 2.0.

Best for: Deploying LLMs on edge devices, mobile apps, browsers, and custom hardware
Works with: Llama 3, Mistral, Phi, Gemma, Qwen, StableLM, and 50+ models


Key Features

Universal Hardware Support

Platform        Backend
NVIDIA GPU      CUDA
Apple Silicon   Metal
AMD GPU         Vulkan / ROCm
Browsers        WebGPU
iOS             Metal + Core ML
Android         Vulkan + OpenCL
CPU             x86 / ARM
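Because backend availability varies per device, it helps to feature-detect before picking one. A minimal sketch for the browser case, assuming the standard `navigator.gpu` WebGPU check (the injectable parameter is only there so the function can run outside a browser):

```typescript
// Feature-detect WebGPU before attempting to load a WebLLM engine.
// In a real page you would call hasWebGPU() with no argument, which
// falls back to the global `navigator` object.
function hasWebGPU(
  nav: { gpu?: unknown } = (globalThis as { navigator?: { gpu?: unknown } }).navigator ?? {}
): boolean {
  // WebGPU-capable browsers expose a `gpu` property on `navigator`.
  return nav.gpu !== undefined;
}

// Example: decide whether to run in-browser or fall back to a server API.
const runLocally = hasWebGPU();
```

If the check fails, a page can fall back to calling a hosted OpenAI-compatible endpoint instead of running the model in the browser.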

WebLLM (Browser)

Run LLMs entirely in the browser with WebGPU — no server needed:

import { CreateMLCEngine } from "@mlc-ai/web-llm";
const engine = await CreateMLCEngine("Llama-3-8B-Instruct-q4f16_1-MLC");
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }],
});

Quantization

Supports 4-bit and 3-bit weight quantization, letting large models fit and run on consumer hardware.

OpenAI-Compatible API

REST server with OpenAI-compatible endpoints — drop-in replacement.
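Since the endpoints mirror OpenAI's API, any OpenAI-style client works. A minimal sketch; the host/port (`localhost:8000`) and model id are assumptions — use whatever your local server is actually configured with:

```typescript
// Request payload for an OpenAI-compatible /v1/chat/completions endpoint.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatRequest(model: string, messages: ChatMessage[]) {
  // Shape follows the OpenAI chat completions schema.
  return { model, messages, stream: false };
}

// Send one prompt and return the assistant's reply text.
async function chat(baseUrl: string, model: string, prompt: string): Promise<string> {
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(model, [{ role: "user", content: prompt }])),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}

// Usage (assumed local server address and model id):
// const reply = await chat("http://localhost:8000", "Llama-3-8B-Instruct-q4f16_1-MLC", "Hello!");
```

Because the schema matches OpenAI's, switching an existing app over is typically just a base-URL change.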

Native Performance

ML compilation optimizes models for each specific hardware target, achieving near-optimal throughput.


FAQ

Q: What is MLC-LLM? A: A universal LLM deployment engine that compiles models for native performance on any hardware — phones, browsers, GPUs, and CPUs. 22K+ stars, Apache 2.0.

Q: Can I run Llama 3 on my iPhone? A: Yes, MLC-LLM compiles Llama 3 (quantized) for iOS with Metal acceleration. There's an iOS app available.



Source and acknowledgements

Created by MLC AI. Licensed under Apache 2.0. mlc-ai/mlc-llm — 22,000+ GitHub stars
