Jan — Run AI Models Locally on Your Desktop
Open-source desktop app to run LLMs offline. Jan supports Llama, Mistral, and Gemma models with one-click download, OpenAI-compatible API, and full privacy.
Install with prior review
This asset requires review. The copied prompt requests a dry run, shows the writes, and continues only after confirmation.
npx -y tokrepo@latest install 1abc2bed-5fef-46fd-9a10-71a639eb26ad --target codex
Do a dry run first, confirm the writes, and then run this command.
What it is
Jan is an open-source desktop application for running large language models locally on your computer. It provides a ChatGPT-like interface where you browse a model hub, download models (Llama, Mistral, Gemma, and others) with one click, and start chatting immediately. Everything runs on your hardware with no data leaving your machine.
Jan targets developers, researchers, and privacy-conscious users who want to experiment with LLMs without sending data to cloud APIs. It runs on macOS, Windows, and Linux, supporting both CPU and GPU inference.
How it saves time or tokens
Using cloud LLM APIs means paying per token and trusting a third party with your data. Jan removes both costs once the model is downloaded. For experimentation, prototyping, and processing sensitive data, running locally avoids API spend altogether. Because the local API is OpenAI-compatible, existing code can be pointed at localhost:1337 by changing only the base URL.
How to use
- Download Jan from jan.ai for your platform (Mac, Windows, Linux).
- Open Jan, go to the Model Hub, and download a model (e.g., Llama 3.1 8B).
- Start chatting in the built-in UI, fully offline.
- Optionally, use the local API:
curl http://localhost:1337/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Explain transformers briefly"}]
  }'
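The curl call above can also be sketched with only the Python standard library, which is handy when the OpenAI SDK is not installed. The helper names (`build_chat_request`, `extract_reply`, `ask`) are hypothetical, and the port and model ID are taken from the example above and may differ on your machine. The response shape is assumed to follow the OpenAI chat completions format.

```python
import json
import urllib.request

# Assumed default from the steps above; adjust if your Jan server uses another port.
JAN_BASE_URL = "http://localhost:1337/v1"


def build_chat_request(prompt, model="llama-3.1-8b", base_url=JAN_BASE_URL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Pull the assistant text out of an OpenAI-style chat completion dict."""
    return response_body["choices"][0]["message"]["content"]


def ask(prompt, timeout=120):
    """Send the prompt to a running local server and return the reply text."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))
```

With the Jan server running and a model loaded, `ask("Explain transformers briefly")` should return the reply as a string. Building the request separately from sending it keeps the payload easy to inspect without a live server.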
Example
from openai import OpenAI

# Point to Jan's local API
client = OpenAI(base_url='http://localhost:1337/v1', api_key='not-needed')

response = client.chat.completions.create(
    model='llama-3.1-8b',
    messages=[{'role': 'user', 'content': 'What is retrieval augmented generation?'}]
)
print(response.choices[0].message.content)
This uses the standard OpenAI Python SDK pointed at your local Jan instance. No API key needed, no data sent externally.
Related on TokRepo
- Local LLM Tools -- Compare local LLM runners like Jan, Ollama, and LM Studio
- Local LLM: Jan -- Deep dive into Jan's capabilities
Common pitfalls
- Large models (70B+ parameters) require significant RAM and VRAM. Check model requirements before downloading. Start with 7B-8B parameter models on consumer hardware.
- The OpenAI-compatible API listens on localhost by default. If you need network access, configure the bind address carefully and consider authentication.
- Model download sizes are large (4-50+ GB). Ensure sufficient disk space and a stable connection before starting downloads.
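The three checks above can be roughed out in a few lines of Python. This is a sketch under stated assumptions, not anything Jan ships: the function names are hypothetical, the 4-bit default and the 20% overhead allowance are illustrative guesses, and real memory use depends on quantization, context length, and the runtime.

```python
import ipaddress
import shutil


def estimate_model_memory_gb(params_billion, bits_per_weight=4.0, overhead_fraction=0.2):
    """Rough memory needed to load a model, in GB (1 GB = 1e9 bytes).

    Weights take params * bits_per_weight / 8 bytes. overhead_fraction is an
    assumed allowance for the KV cache and runtime buffers.
    """
    weight_gb = params_billion * bits_per_weight / 8
    return weight_gb * (1 + overhead_fraction)


def is_loopback_only(bind_host):
    """Return True if bind_host only accepts connections from this machine."""
    if bind_host == "localhost":
        return True
    try:
        return ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): treat it as exposed.
        return False


def has_room_for(download_gb, path=".", safety_margin_gb=2.0):
    """Return True if the disk holding path has space for the download plus a margin."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= download_gb + safety_margin_gb
```

Under these assumptions, `estimate_model_memory_gb(8)` comes to about 4.8 GB and `estimate_model_memory_gb(70)` to about 42 GB, which is why 7B-8B models are the practical starting point on consumer hardware. `is_loopback_only("0.0.0.0")` returns False, flagging a bind address that is reachable from the network.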
Frequently asked questions
Jan runs on any modern computer. For CPU-only inference, 8GB RAM handles 7B models. For GPU acceleration, an NVIDIA GPU with 6GB+ VRAM dramatically improves speed. Apple Silicon Macs use Metal for acceleration.
Jan runs entirely on your local machine. No telemetry, no data sent to external servers. Models are downloaded once and run offline. Your conversations never leave your computer.
Ollama is CLI-first and optimized for developers. Jan provides a full desktop GUI similar to ChatGPT. Both offer OpenAI-compatible APIs. Choose Jan for a visual experience; choose Ollama for terminal workflows.
Jan supports NVIDIA CUDA for GPU acceleration. It auto-detects available GPUs and offers GPU layers configuration. AMD ROCm support is also available on Linux.
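As a rough guide to the GPU layers setting, the sketch below estimates how many layers might fit in VRAM. It assumes every layer is the same size (model file size divided by layer count) and reserves a fixed amount of VRAM for the context and other buffers. Both are simplifications, and `gpu_layers_that_fit` is a hypothetical helper, not part of Jan.

```python
def gpu_layers_that_fit(vram_gb, model_file_gb, total_layers, reserve_gb=1.0):
    """Estimate how many model layers fit in VRAM, assuming equal-sized layers."""
    if total_layers <= 0 or model_file_gb <= 0:
        raise ValueError("total_layers and model_file_gb must be positive")
    per_layer_gb = model_file_gb / total_layers
    # Keep reserve_gb free for the KV cache and scratch buffers (assumed value).
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(total_layers, int(usable_gb // per_layer_gb))
```

For a hypothetical 5 GB model with 32 layers, a 4 GB card would take about 19 layers under these assumptions, with the remainder running on the CPU. Treat the result as a starting value and adjust it if the model fails to load or runs slowly.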
Jan primarily uses GGUF format models (llama.cpp compatible). The built-in model hub offers pre-configured models. You can also import custom GGUF models from sources like Hugging Face.
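Before importing a custom model, it can help to confirm that the file really is GGUF. A GGUF file starts with the four ASCII bytes `GGUF` followed by a 32-bit version number. The sketch below reads that header, assuming the common little-endian layout. `read_gguf_version` is a hypothetical helper name.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_version(path):
    """Return the GGUF version number, or None if the file is not GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # Bytes 4-7 hold the format version as a little-endian unsigned 32-bit int.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

A return value of None usually means the download is incomplete or the file uses another format, such as safetensors, and needs converting first.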
Related assets
Jan — Offline AI Desktop App with Full Privacy
Jan is an open-source ChatGPT alternative that runs LLMs locally with full privacy. 41.4K+ GitHub stars. Desktop app for Windows/macOS/Linux, OpenAI-compatible API, MCP support. Apache 2.0.
LocalAI — Run Any AI Model Locally, No GPU
LocalAI is an open-source AI engine running LLMs, vision, voice, and image models locally. 44.6K+ GitHub stars. OpenAI/Anthropic-compatible API, 35+ backends, MCP, agents. MIT licensed.
Replicate — Run AI Models via Simple API Calls
Cloud platform to run open-source AI models with a simple API. Replicate hosts Llama, Stable Diffusion, Whisper, and thousands of models — no GPU setup or Docker required.
LLMFit — Find What Models Run on Your Hardware
A Rust CLI that scans your system specs and matches them against hundreds of LLM models and providers to tell you exactly what you can run locally.