Skills · Apr 8, 2026 · 3 min read

Jan — Run AI Models Locally on Your Desktop

Open-source desktop app to run LLMs offline. Jan supports Llama, Mistral, and Gemma models with one-click downloads, an OpenAI-compatible API, and full privacy.

TL;DR
Jan lets you download and run open-source LLMs locally with a desktop UI and an OpenAI-compatible API, fully offline.
§01

What it is

Jan is an open-source desktop application for running large language models locally on your computer. It provides a ChatGPT-like interface where you browse a model hub, download models (Llama, Mistral, Gemma, and others) with one click, and start chatting immediately. Everything runs on your hardware with no data leaving your machine.

Jan targets developers, researchers, and privacy-conscious users who want to experiment with LLMs without sending data to cloud APIs. It runs on macOS, Windows, and Linux, supporting both CPU and GPU inference.

§02

How it saves time or tokens

Using cloud LLM APIs means paying per token and trusting a third party with your data. Jan removes both concerns after the initial model download: inference costs nothing beyond your own hardware, and data never leaves your machine. For experimentation, prototyping, and sensitive-data processing, running locally eliminates API spend entirely. Because the local API is OpenAI-compatible, you can point existing client code at localhost:1337 and it typically works without changes.
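
As a minimal sketch of that drop-in behavior (assuming the openai Python package v1+, which reads OPENAI_BASE_URL and OPENAI_API_KEY from the environment), existing client code can be repointed at Jan without touching the call sites:

import os

# Sketch: repoint existing OpenAI-client code at Jan via environment variables.
# Assumes openai>=1.0, which falls back to OPENAI_BASE_URL / OPENAI_API_KEY.
os.environ['OPENAI_BASE_URL'] = 'http://localhost:1337/v1'
os.environ['OPENAI_API_KEY'] = 'not-needed'  # Jan does not require a real key

from openai import OpenAI

client = OpenAI()  # picks up the variables above; downstream code is unchanged

This is handy for pointing an existing script or framework at Jan without editing its source.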

§03

How to use

  1. Download Jan from jan.ai for your platform (macOS, Windows, Linux).
  2. Open Jan, go to the Model Hub, and download a model (e.g., Llama 3.1 8B).
  3. Start chatting in the built-in UI, fully offline.
  4. Optionally, use the local API (a Python sketch for listing models follows the curl example):
curl http://localhost:1337/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Explain transformers briefly"}]
  }'
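
Since the API follows the OpenAI surface, other standard endpoints should work too. As a hedged sketch (assuming Jan exposes the usual /v1/models endpoint), you can list the models you have downloaded:

# Sketch: list locally available models via Jan's OpenAI-compatible API.
# Assumes the standard /v1/models endpoint is exposed by the local server.
from openai import OpenAI

client = OpenAI(base_url='http://localhost:1337/v1', api_key='not-needed')

for model in client.models.list():
    print(model.id)  # e.g., "llama-3.1-8b" once downloaded in the Model Hub
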
§04

Example

from openai import OpenAI

# Point to Jan's local API
client = OpenAI(base_url='http://localhost:1337/v1', api_key='not-needed')

response = client.chat.completions.create(
    model='llama-3.1-8b',
    messages=[{'role': 'user', 'content': 'What is retrieval augmented generation?'}]
)

print(response.choices[0].message.content)

This uses the standard OpenAI Python SDK pointed at your local Jan instance. No API key needed, no data sent externally.
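
If you want tokens to appear as they are generated, the same client can stream. This is a sketch, assuming Jan's server supports the standard stream parameter of the chat completions endpoint:

# Sketch: streamed completion against the local Jan server (assumes the
# standard OpenAI streaming protocol is supported for chat completions).
from openai import OpenAI

client = OpenAI(base_url='http://localhost:1337/v1', api_key='not-needed')

stream = client.chat.completions.create(
    model='llama-3.1-8b',
    messages=[{'role': 'user', 'content': 'Summarize RAG in two sentences.'}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end='', flush=True)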

§05

Common pitfalls

  • Large models (70B+ parameters) require significant RAM and VRAM. Check model requirements before downloading (a rough sizing sketch follows this list). Start with 7B-8B parameter models on consumer hardware.
  • The OpenAI-compatible API listens on localhost by default. If you need network access, configure the bind address carefully and consider authentication.
  • Model download sizes are large (4-50+ GB). Ensure sufficient disk space and a stable connection before starting downloads.
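
As a rough sizing aid (an estimate only, not a Jan-specific figure; actual needs vary with quantization level and context length), a common rule of thumb for 4-bit GGUF models is roughly 0.5-0.6 bytes per parameter plus a few GB of overhead:

# Rough rule-of-thumb memory estimate for 4-bit quantized GGUF models.
def approx_memory_gb(params_billion, bytes_per_param=0.6, overhead_gb=2.0):
    return params_billion * bytes_per_param + overhead_gb

for size in (8, 13, 70):
    print(f'{size}B parameters: ~{approx_memory_gb(size):.0f} GB')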

Frequently Asked Questions

What hardware does Jan require?

Jan runs on any modern computer. For CPU-only inference, 8GB RAM handles 7B models. For GPU acceleration, an NVIDIA GPU with 6GB+ VRAM dramatically improves speed. Apple Silicon Macs use Metal for acceleration.

Is Jan truly private?

Yes. Jan runs entirely on your local machine. No telemetry, no data sent to external servers. Models are downloaded once and run offline. Your conversations never leave your computer.

How does Jan compare to Ollama?

Ollama is CLI-first and optimized for developers. Jan provides a full desktop GUI similar to ChatGPT. Both offer OpenAI-compatible APIs. Choose Jan for a visual experience; choose Ollama for terminal workflows.

Can Jan use NVIDIA GPUs?

Yes. Jan supports NVIDIA CUDA for GPU acceleration. It auto-detects available GPUs and lets you configure how many model layers to offload to the GPU. AMD ROCm support is also available on Linux.

What model formats does Jan support?

Jan primarily uses GGUF format models (llama.cpp compatible). The built-in model hub offers pre-configured models. You can also import custom GGUF models from sources like Hugging Face.


Source & Thanks

Created by janhq. Licensed under AGPL-3.0.

janhq/jan — 26k+ stars
