Configs · Apr 6, 2026 · 2 min read

Ollama — Run LLMs Locally with One Command

Run Llama 3, Mistral, Gemma, Phi, and 100+ open-source LLMs locally with a single command. OpenAI-compatible API for seamless integration with AI tools. 120,000+ GitHub stars.

AI · Open Source · Community
Quick Use

Use it first, then decide how deep to go

Everything needed to install Ollama and run a first model is below: copy, paste, done.

# Install (macOS)
brew install ollama

# Install (Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Run a model
ollama run llama3.1
# Chat starts immediately — no API keys, no cloud, no cost

Use as OpenAI-compatible API:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.1","messages":[{"role":"user","content":"Hello"}]}'

Intro

Ollama is an open-source tool (120,000+ GitHub stars) that runs Llama 3, Mistral, Gemma, Phi, and 100+ other large language models locally with a single command. No API keys, no cloud costs, no data leaving your machine. It exposes an OpenAI-compatible API at localhost:11434, making it a drop-in local replacement for cloud LLMs in any tool that speaks the OpenAI protocol. Best for developers who want privacy, no network latency, and unlimited free inference. Works with: Claude Code (via LiteLLM), Cursor, Continue, LangChain, and any OpenAI-compatible client. Setup time: under 2 minutes.


Popular Models

Model            Size          Best For
llama3.1         8B / 70B      General purpose, coding
mistral          7B            Fast, multilingual
codestral        22B           Code generation
gemma2           9B / 27B      Compact, efficient
phi3             3.8B / 14B    Small-device deployment
qwen2.5          7B / 72B      Multilingual, math
deepseek-coder   6.7B / 33B    Code completion
llava            7B / 13B      Vision + text

ollama pull llama3.1:70b    # Download 70B model
ollama pull codestral       # Code-specialized model
ollama list                 # See installed models
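The same data `ollama list` prints is also available over HTTP from Ollama's native `/api/tags` endpoint, which is handy for scripting. A minimal stdlib-only sketch (the default host and port are assumed; the JSON parsing is kept separate so it works without a running server):

```python
import json
import urllib.request

def model_names(tags_json: str) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in json.loads(tags_json).get("models", [])]

def list_models(host: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models are installed."""
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return model_names(resp.read().decode())
```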

OpenAI-Compatible API

Point any OpenAI SDK client to http://localhost:11434/v1:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # any non-empty key works; Ollama ignores it
response = client.chat.completions.create(
    model="llama3.1",
    messages=[{"role": "user", "content": "Write a Python function"}]
)
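For token-by-token output you can also hit Ollama's native `/api/chat` endpoint, which streams one JSON object per line until `done` is true. A stdlib-only sketch (default host assumed; the line parser is pure so it can be exercised without a server):

```python
import json
import urllib.request

def parse_stream_line(line: bytes) -> str:
    """Extract the assistant token from one NDJSON stream line."""
    obj = json.loads(line)
    return obj.get("message", {}).get("content", "")

def stream_chat(prompt: str, model: str = "llama3.1") -> str:
    """Stream a chat reply from a running Ollama server, printing tokens as they arrive."""
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    reply = []
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # one JSON object per line until done=true
            token = parse_stream_line(line)
            print(token, end="", flush=True)
            reply.append(token)
    return "".join(reply)
```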

Use with AI Tools

Continue (VS Code):

{"models": [{"title": "Llama", "provider": "ollama", "model": "llama3.1"}]}

LiteLLM proxy:

litellm --model ollama/llama3.1

LangChain:

from langchain_community.llms import Ollama  # newer releases: from langchain_ollama import OllamaLLM
llm = Ollama(model="llama3.1")

Custom Modelfiles

Create custom models with system prompts and parameters:

Save as `Modelfile`:

FROM llama3.1
SYSTEM "You are a senior Python developer. Always write type-hinted, well-tested code."
PARAMETER temperature 0.3
PARAMETER num_ctx 8192

Then build and run it:

ollama create my-coder -f Modelfile
ollama run my-coder
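Since a Modelfile is plain text, it is easy to generate from configuration. A small illustrative sketch (the `render_modelfile` helper is hypothetical, not part of Ollama) that emits the same format shown above:

```python
def render_modelfile(base: str, system: str, params: dict) -> str:
    """Render a Modelfile string from a base model, system prompt, and parameters."""
    lines = [f"FROM {base}", f'SYSTEM "{system}"']
    lines += [f"PARAMETER {k} {v}" for k, v in params.items()]
    return "\n".join(lines) + "\n"

modelfile = render_modelfile(
    "llama3.1",
    "You are a senior Python developer.",
    {"temperature": 0.3, "num_ctx": 8192},
)
print(modelfile)
```

Write the result to a file named `Modelfile`, then build it with `ollama create my-coder -f Modelfile` as above.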

Key Stats

  • 120,000+ GitHub stars
  • 100+ available models
  • OpenAI-compatible API
  • Runs on macOS, Linux, Windows
  • GPU acceleration (NVIDIA, Apple Silicon)

FAQ

Q: What is Ollama? A: Ollama is a tool that runs open-source LLMs locally with one command, providing an OpenAI-compatible API for seamless integration with AI development tools.

Q: Is Ollama free? A: Yes, completely free and open-source under MIT license. No API keys or usage fees.

Q: What hardware do I need? A: 8GB RAM for 7B models, 16GB for 13B, 64GB for 70B. Apple Silicon and NVIDIA GPUs are automatically utilized for acceleration.
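The RAM figures above follow a back-of-the-envelope rule: weight memory is roughly parameter count times bits per weight divided by 8, plus headroom for the KV cache and runtime. A rough heuristic sketch (the ~20% headroom factor is an assumption, not an official Ollama figure; Ollama's default downloads are 4-bit quantized):

```python
def estimate_ram_gb(params_billions: float, quant_bits: int = 4) -> float:
    """Rough RAM estimate: weight bytes plus ~20% headroom for cache/runtime."""
    weight_gb = params_billions * quant_bits / 8
    return round(weight_gb * 1.2, 1)

print(estimate_ram_gb(7))   # 7B at 4-bit → 4.2
print(estimate_ram_gb(70))  # 70B at 4-bit → 42.0
```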



Source & Thanks

Created by Ollama. Licensed under MIT.

ollama — ⭐ 120,000+

Thanks to the Ollama team for making local LLM inference effortless.
