What is Chroma?
Chroma is a lightweight, open-source embedding database designed for AI applications. It handles embedding generation, storage, and retrieval in one package. You can start with the in-memory client for prototyping and move to persistent or client-server storage for production by swapping a single client constructor, with no other infrastructure changes. It ships first-class integrations with LangChain, LlamaIndex, and OpenAI.
Answer-Ready: Chroma is an open-source embedding database for AI. Auto-generates embeddings, stores and queries vectors with zero config. In-memory, local file, or client-server modes. Native LangChain/LlamaIndex integration. Simplest path from prototype to production RAG. 16k+ GitHub stars.
Best for: Developers building RAG prototypes that need to scale. Works with: LangChain, LlamaIndex, OpenAI, any embedding model. Setup time: Under 1 minute.
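Under the hood, every query against an embedding database reduces to nearest-neighbor search over stored vectors. A stdlib-only sketch of cosine-similarity ranking shows the idea (illustrative only; Chroma uses an optimized HNSW index rather than a linear scan):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "collection": ids mapped to embedding vectors
store = {
    "doc1": [1.0, 0.0, 0.0],
    "doc2": [0.0, 1.0, 0.0],
    "doc3": [0.9, 0.1, 0.0],
}

def query(embedding, n_results=2):
    # Rank every stored vector by similarity to the query (linear scan)
    ranked = sorted(store, key=lambda i: cosine(store[i], embedding), reverse=True)
    return ranked[:n_results]

print(query([1.0, 0.0, 0.0]))  # doc1 and doc3 are the closest matches
```

Chroma automates everything around this loop: producing the embeddings, persisting them, and indexing them for fast lookup.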
Core Features
1. Three Deployment Modes
import chromadb

# In-memory (prototyping)
client = chromadb.Client()

# Local persistent (single-user)
client = chromadb.PersistentClient(path="./db")

# Client-server (production)
# Server: chroma run --path ./db --port 8000
client = chromadb.HttpClient(host="localhost", port=8000)

2. Auto-Embedding
# Chroma embeds text automatically with the default model
collection = client.create_collection("docs")
collection.add(documents=["Hello world"], ids=["1"])

# Or bring your own embeddings
collection.add(
    embeddings=[[0.1, 0.2, 0.3, ...]],
    documents=["Hello world"],
    ids=["1"],
)

# Or use a custom embedding function
from chromadb.utils import embedding_functions

openai_ef = embedding_functions.OpenAIEmbeddingFunction(api_key="sk-...")
collection = client.create_collection("docs", embedding_function=openai_ef)

3. Metadata Filtering
results = collection.query(
    query_texts=["AI tools"],
    n_results=5,
    where={"category": "development"},
    where_document={"$contains": "Python"},
)

4. LangChain Integration
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

# docs: a list of LangChain Document objects
vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    persist_directory="./chroma_db",
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

Chroma vs Alternatives
| Feature | Chroma | Qdrant | Pinecone | FAISS |
|---|---|---|---|---|
| Self-hosted | Yes | Yes | No | Yes |
| Auto-embedding | Yes | No | No | No |
| Zero config | Yes | Docker needed | Account needed | Code needed |
| Metadata filter | Yes | Advanced | Yes | No |
| Managed cloud | Yes | Yes | Yes | No |
| Best for | Prototyping → prod | Production scale | Managed scale | Research |
FAQ
Q: How does it scale? A: Client-server mode supports millions of embeddings. For billions, consider Qdrant or Pinecone.
Q: Is auto-embedding good enough? A: The default model (all-MiniLM-L6-v2) is decent for English prototypes. For production quality or multilingual workloads, use OpenAI or Cohere embeddings.
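Swapping in a different model is easy because Chroma accepts any callable that maps a list of texts to equal-length vectors. A toy deterministic sketch of that interface (hypothetical `ToyEmbeddingFunction`, hashing-based with no semantics; real deployments would wrap OpenAI, Cohere, or a sentence-transformers model):

```python
import hashlib

class ToyEmbeddingFunction:
    """Maps each text to a fixed-size vector via hashing (demo only)."""

    def __init__(self, dim=8):
        self.dim = dim

    def __call__(self, input):
        # Chroma-style signature: list of documents in, list of vectors out
        vectors = []
        for text in input:
            digest = hashlib.sha256(text.encode()).digest()
            vectors.append([b / 255.0 for b in digest[: self.dim]])
        return vectors

ef = ToyEmbeddingFunction()
vecs = ef(["hello", "world"])
print(len(vecs), len(vecs[0]))  # 2 vectors of dimension 8
```

Because hashing ignores meaning, nothing like this belongs in production; it only shows the shape of the contract a real embedding function must satisfy.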
Q: Can I use it with Claude? A: Yes. Use Chroma to retrieve relevant documents and include them as context in Claude prompts, or store Claude's outputs in Chroma for later retrieval.
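The common RAG pattern is to stuff Chroma's top-k results into the prompt sent to Claude. A stdlib sketch of the prompt-assembly step (the retrieval call and the Claude API call themselves are assumed, not shown; `build_rag_prompt` is a hypothetical helper):

```python
def build_rag_prompt(question, retrieved_docs):
    # Number each retrieved chunk so the model can cite its sources
    context = "\n\n".join(
        f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved_docs)
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "What is Chroma?",
    ["Chroma is an open-source embedding database.",
     "It has three deployment modes."],
)
print(prompt)
```

In practice `retrieved_docs` would come from `collection.query(...)["documents"]`, and the assembled string would be passed as the user message to Claude.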