Knowledge · May 8, 2026 · 6 min read

Cherry Studio Knowledge Base — Local RAG with 50+ Formats

Cherry Studio Knowledge Base ingests PDFs, Office docs, and Markdown into a local vector index. Query offline, BYOK any LLM. Data stays on your machine.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, a per-adapter plan, and the raw content to help agents judge fit, risk, and next actions.

Needs Confirmation · 64/100 · Policy: confirm
Agent surface: any MCP/CLI agent
Type: Knowledge
Installation: Single
Trust: New
Entry point: Asset

Universal CLI command:
npx tokrepo install e8255b25-1bb1-47a8-bff9-ca5a445ce3f1
Introduction

Cherry Studio Knowledge Base lets the desktop app ingest 50+ file formats into a local vector index: PDFs, Word docs, Markdown, EPUB, even web bookmarks. Retrieval runs locally; answers come from your choice of LLM (OpenAI, Claude, or fully offline with Ollama). Best for: privacy-conscious users who want personal RAG without sending docs to a cloud service. Works with: Cherry Studio 1.4+ on macOS / Windows / Linux. Setup time: 5 minutes.


Build a knowledge base

  1. Download Cherry Studio from cherry-ai.com
  2. Settings → Models → add an embedding model (Ollama: nomic-embed-text, OpenAI: text-embedding-3-small, Voyage AI, etc.); a quick sanity check follows this list
  3. Sidebar → Knowledge → New Knowledge Base
  4. Name it, pick the embedding model, set chunk size (default 1000)
  5. Drag and drop files or paste a folder
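
Before ingesting anything, it helps to confirm the embedding model from step 2 actually responds. A minimal sketch using Ollama's /api/embeddings endpoint (this calls Ollama directly and is not a Cherry Studio API; it assumes Ollama is running locally and `ollama pull nomic-embed-text` has been done):

  import requests

  resp = requests.post(
      "http://localhost:11434/api/embeddings",
      json={"model": "nomic-embed-text", "prompt": "hello knowledge base"},
      timeout=30,
  )
  resp.raise_for_status()
  vector = resp.json()["embedding"]
  print(len(vector))  # nomic-embed-text returns 768 dimensions

If this prints a dimension count, Cherry Studio can use the same model for ingestion.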

Supported formats

Category     Formats
Documents    PDF, DOCX, DOC, RTF, ODT, EPUB
Office       XLSX, CSV, PPTX
Code         all text-based source (PY, JS, TS, GO, …)
Web          URL list (auto-fetches and chunks)
Markdown     MD, MDX
Notebook     IPYNB
Plain text   TXT, LOG

Query the knowledge base in chat

Enable the knowledge base toggle in any chat. Cherry Studio retrieves the top-k most relevant chunks per query and prepends them to the LLM prompt with citations.
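
Cherry Studio's exact prompt template isn't documented here, but the retrieve-then-prepend pattern it describes looks like this minimal sketch (the index layout and prompt wording are illustrative, not Cherry Studio internals):

  import numpy as np

  def top_k_chunks(query_vec, chunk_vecs, chunks, k=6, floor=0.6):
      # Rank chunks by cosine similarity; keep at most k above the floor.
      q = query_vec / np.linalg.norm(query_vec)
      m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
      sims = m @ q
      best = np.argsort(sims)[::-1][:k]
      return [(chunks[i], float(sims[i])) for i in best if sims[i] >= floor]

  def build_prompt(question, hits):
      # Prepend retrieved chunks, numbered so the model can cite them as [n].
      sources = "\n".join(f"[{n}] {text}" for n, (text, _) in enumerate(hits, 1))
      return ("Answer using the sources below and cite them as [n].\n\n"
              f"{sources}\n\nQuestion: {question}")

The k=6 and floor=0.6 defaults mirror the Top-K and Threshold settings described in the next section.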

Configure retrieval

Knowledge Base settings:
  Chunk size: 1000 chars
  Chunk overlap: 200 chars
  Top-K: 6 chunks per query
  Rerank: optional (BGE Reranker via Ollama)
  Threshold: 0.6 (cosine similarity floor)
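
Chunk size and overlap interact simply: each chunk starts (size − overlap) characters after the previous one, so adjacent chunks share that much context. A character-level sketch (Cherry Studio's real splitter may respect sentence or paragraph boundaries):

  def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
      # Each window advances by (size - overlap) characters.
      step = size - overlap
      return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

  print([len(p) for p in chunk("x" * 2500)])  # [1000, 1000, 900]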

Sync vs local-only

  • Local-only (default): vector store on disk, e.g. under ~/Library/Application Support/CherryStudio/... on macOS
  • Sync (optional): push the index to S3-compatible storage (R2, MinIO) for cross-device sync, encrypted with a passphrase only you hold (see the sketch below)
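
Cherry Studio's actual sync format isn't specified here; as a generic illustration of "passphrase-encrypted blob in S3-compatible storage", here is a sketch using the cryptography and boto3 packages (the file names, bucket, and endpoint are hypothetical):

  import base64, os
  import boto3
  from cryptography.fernet import Fernet
  from cryptography.hazmat.primitives import hashes
  from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

  def key_from_passphrase(passphrase: str, salt: bytes) -> bytes:
      # Derive a Fernet key from the passphrase; only the random salt is uploaded.
      kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=480_000)
      return base64.urlsafe_b64encode(kdf.derive(passphrase.encode()))

  salt = os.urandom(16)
  with open("kb-index.bin", "rb") as f:  # hypothetical local index file
      token = Fernet(key_from_passphrase("only-you-hold-this", salt)).encrypt(f.read())

  # R2 and MinIO speak the S3 API, so boto3 only needs a custom endpoint_url.
  s3 = boto3.client("s3", endpoint_url="https://<account>.r2.cloudflarestorage.com")
  s3.put_object(Bucket="kb-sync", Key="kb-index.enc", Body=salt + token)

The storage provider only ever sees ciphertext plus a random salt; without the passphrase the index cannot be decrypted.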

When to use Cherry Studio KB vs a hosted RAG

Cherry Studio KB                          Pinecone Assistant / similar hosted
Personal docs, sensitive content          Multi-user team docs
Offline access                            Always-online
Free app + your own LLM costs             Per-query subscription
Single device (or DIY sync)               Cross-device by default

FAQ

Q: Is Cherry Studio free? A: Yes — Cherry Studio is open-source under Apache-2.0. The app is free; you bring your own LLM API keys and pay only for inference. Local Ollama models are fully free.

Q: Can it handle large PDFs? A: Yes — large PDFs are chunked at the configured chunk size. A 500-page PDF takes ~1 minute to embed locally with Ollama and produces a few thousand chunks. Search is fast (cosine on a local FAISS-style index).
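
The "few thousand chunks" figure follows from the defaults. Assuming roughly 3,000 characters per page (an assumption; density varies by document):

  pages, chars_per_page = 500, 3000      # chars/page is a rough assumption
  size, overlap = 1000, 200
  print(pages * chars_per_page // (size - overlap))  # ≈ 1875 chunks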

Q: Does the knowledge base work with images? A: Text-only for now. For PDFs with images, text is extracted from the embedded text layer; image-only pages with no text layer contribute no content. Image search is on the roadmap but not stable in 1.4.


Quick Use

  1. Download Cherry Studio from cherry-ai.com
  2. Settings → Models → add an embedding model (Ollama nomic-embed-text or OpenAI text-embedding-3-small)
  3. Sidebar → Knowledge → New, drag and drop your docs

Source & Thanks

Built by kangfenmao. Licensed under Apache-2.0.

CherryHQ/cherry-studio — ⭐ 18,000+
