What is Chroma?
Chroma is a lightweight, open-source embedding database designed for AI applications. It handles embedding generation, storage, and retrieval in one package. You can start with the in-memory client for prototyping and move to persistent or client-server storage for production by swapping a single client constructor, with no other infrastructure changes. It ships first-class integrations with LangChain, LlamaIndex, and OpenAI.
Answer-Ready: Chroma is an open-source embedding database for AI. Auto-generates embeddings, stores and queries vectors with zero config. In-memory, local file, or client-server modes. Native LangChain/LlamaIndex integration. Simplest path from prototype to production RAG. 16k+ GitHub stars.
Best for: Developers building RAG prototypes that need to scale. Works with: LangChain, LlamaIndex, OpenAI, any embedding model. Setup time: Under 1 minute.
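Under the hood, every query against an embedding database reduces to nearest-neighbor search over stored vectors. A stdlib-only sketch of cosine-similarity ranking shows the idea (illustrative only; Chroma uses an optimized HNSW index rather than a linear scan):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "collection": ids mapped to embedding vectors
store = {
    "doc1": [1.0, 0.0, 0.0],
    "doc2": [0.0, 1.0, 0.0],
    "doc3": [0.9, 0.1, 0.0],
}

def query(embedding, n_results=2):
    # Rank every stored vector by similarity to the query (linear scan)
    ranked = sorted(store, key=lambda i: cosine(store[i], embedding), reverse=True)
    return ranked[:n_results]

print(query([1.0, 0.0, 0.0]))  # doc1 and doc3 are the closest matches
```

Chroma automates everything around this loop: producing the embeddings, persisting them, and indexing them for fast lookup.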
Core Features
1. Three Deployment Modes
import chromadb

# In-memory (prototyping)
client = chromadb.Client()

# Local persistent (single-user)
client = chromadb.PersistentClient(path="./db")

# Client-server (production)
# Server: chroma run --path ./db --port 8000
client = chromadb.HttpClient(host="localhost", port=8000)

2. Auto-Embedding
# Chroma embeds text automatically with the default model
collection = client.create_collection("docs")
collection.add(documents=["Hello world"], ids=["1"])

# Or bring your own embeddings
collection.add(
    embeddings=[[0.1, 0.2, 0.3, ...]],
    documents=["Hello world"],
    ids=["1"],
)

# Or use a custom embedding function
from chromadb.utils import embedding_functions

openai_ef = embedding_functions.OpenAIEmbeddingFunction(api_key="sk-...")
collection = client.create_collection("docs", embedding_function=openai_ef)

3. Metadata Filtering
results = collection.query(
    query_texts=["AI tools"],
    n_results=5,
    where={"category": "development"},
    where_document={"$contains": "Python"},
)

4. LangChain Integration
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

# docs: a list of LangChain Document objects
vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    persist_directory="./chroma_db",
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

Chroma vs Alternatives
| Feature | Chroma | Qdrant | Pinecone | FAISS |
|---|---|---|---|---|
| Self-hosted | Yes | Yes | No | Yes |
| Auto-embedding | Yes | No | No | No |
| Zero config | Yes | Docker needed | Account needed | Code needed |
| Metadata filter | Yes | Advanced | Yes | No |
| Managed cloud | Yes | Yes | Yes | No |
| Best for | Prototyping → prod | Production scale | Managed scale | Research |
FAQ
Q: How does it scale? A: Client-server mode supports millions of embeddings. For billions, consider Qdrant or Pinecone.
Q: Is auto-embedding good enough? A: The default model (all-MiniLM-L6-v2) is decent for English prototypes. For production quality or multilingual workloads, use OpenAI or Cohere embeddings.
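Swapping in a different model is easy because Chroma accepts any callable that maps a list of texts to equal-length vectors. A toy deterministic sketch of that interface (hypothetical `ToyEmbeddingFunction`, hashing-based with no semantics; real deployments would wrap OpenAI, Cohere, or a sentence-transformers model):

```python
import hashlib

class ToyEmbeddingFunction:
    """Maps each text to a fixed-size vector via hashing (demo only)."""

    def __init__(self, dim=8):
        self.dim = dim

    def __call__(self, input):
        # Chroma-style signature: list of documents in, list of vectors out
        vectors = []
        for text in input:
            digest = hashlib.sha256(text.encode()).digest()
            vectors.append([b / 255.0 for b in digest[: self.dim]])
        return vectors

ef = ToyEmbeddingFunction()
vecs = ef(["hello", "world"])
print(len(vecs), len(vecs[0]))  # 2 vectors of dimension 8
```

Because hashing ignores meaning, nothing like this belongs in production; it only shows the shape of the contract a real embedding function must satisfy.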
Q: Can I use it with Claude? A: Yes. Use Chroma to retrieve relevant documents and include them as context in Claude prompts, or store Claude's outputs in Chroma for later retrieval.
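The common RAG pattern is to stuff Chroma's top-k results into the prompt sent to Claude. A stdlib sketch of the prompt-assembly step (the retrieval call and the Claude API call themselves are assumed, not shown; `build_rag_prompt` is a hypothetical helper):

```python
def build_rag_prompt(question, retrieved_docs):
    # Number each retrieved chunk so the model can cite its sources
    context = "\n\n".join(
        f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved_docs)
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "What is Chroma?",
    ["Chroma is an open-source embedding database.",
     "It has three deployment modes."],
)
print(prompt)
```

In practice `retrieved_docs` would come from `collection.query(...)["documents"]`, and the assembled string would be passed as the user message to Claude.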