Knowledge · May 8, 2026 · 6 min read

Cherry Studio Knowledge Base — Local RAG with 50+ Formats

Cherry Studio Knowledge Base ingests PDFs, Office docs, and Markdown into a local vector index. Query offline, bring your own key for any LLM. Data stays on your machine.

Introduction

Cherry Studio Knowledge Base lets the desktop app ingest 50+ file formats into a local vector index — PDFs, Word docs, Markdown, EPUB, even web bookmarks. Query offline using your choice of LLM (OpenAI, Claude, Ollama, etc.), with retrieval running locally. Best for: privacy-conscious users who want personal RAG without sending docs to a cloud service. Works with: Cherry Studio 1.4+ on macOS, Windows, and Linux. Setup time: 5 minutes.


Build a knowledge base

  1. Download Cherry Studio from cherry-ai.com
  2. Settings → Models → add an embedding model (Ollama: nomic-embed-text, OpenAI: text-embedding-3-small, Voyage AI, etc)
  3. Sidebar → Knowledge → New Knowledge Base
  4. Name it, pick the embedding model, set chunk size (default 1000)
  5. Drag and drop files, or paste a folder path to ingest its contents
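During ingestion, each file is split into overlapping chunks before embedding. The sketch below illustrates fixed-size character chunking with the default 1000-character chunks and 200-character overlap; it is a hypothetical illustration, not Cherry Studio's actual splitter.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Each chunk starts (chunk_size - overlap) characters after the
    previous one, so consecutive chunks share `overlap` characters.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = "x" * 2500
print(len(chunk_text(doc)))  # chunks start at 0, 800, 1600, 2400 -> 4
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.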

Supported formats

Category    Formats
Documents   PDF, DOCX, DOC, RTF, ODT, EPUB
Office      XLSX, CSV, PPTX
Code        All text-based source (PY, JS, TS, GO, …)
Web         URL list (auto-fetches and chunks)
Markdown    MD, MDX
Notebook    IPYNB
Plain text  TXT, LOG

Query the knowledge base in chat

Enable the knowledge base toggle in any chat. Cherry Studio retrieves the top-k most relevant chunks for each query and prepends them to the LLM prompt with citations.
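The prompt-assembly step can be sketched as below. This assumes each retrieved chunk carries its source filename; the exact prompt template Cherry Studio uses is not shown here.

```python
def build_prompt(query: str, chunks: list[dict]) -> str:
    """Prepend retrieved chunks, each tagged with a numbered source, to the query."""
    context = "\n\n".join(
        f"[{i + 1}] (source: {c['source']})\n{c['text']}"
        for i, c in enumerate(chunks)
    )
    return (
        "Answer using the context below. Cite sources as [n].\n\n"
        f"{context}\n\nQuestion: {query}"
    )

chunks = [
    {"source": "manual.pdf", "text": "Chunk size defaults to 1000 characters."},
    {"source": "faq.md", "text": "Ollama models run fully offline."},
]
print(build_prompt("What is the default chunk size?", chunks))
```

The numbered tags are what let the model emit citations like [1] that map back to specific files.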

Configure retrieval

Knowledge Base settings:
  Chunk size: 1000 chars
  Chunk overlap: 200 chars
  Top-K: 6 chunks per query
  Rerank: optional (BGE Reranker via Ollama)
  Threshold: 0.6 (cosine similarity floor)
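These settings can be sketched as a plain cosine-similarity retriever with a score floor. This is an illustrative sketch under those defaults, not the app's actual index code; an optional reranker would reorder the survivors before truncating to top-k.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, index, top_k: int = 6, threshold: float = 0.6):
    """Score every chunk, drop those below the similarity floor, keep the top k."""
    scored = [(cosine(query_vec, vec), text) for vec, text in index]
    scored = [(s, t) for s, t in scored if s >= threshold]  # similarity floor
    scored.sort(key=lambda st: st[0], reverse=True)
    return scored[:top_k]

index = [
    ([1.0, 0.0], "about embeddings"),
    ([0.7, 0.7], "partly related"),
    ([0.0, 1.0], "unrelated"),     # scores 0.0, filtered by the threshold
]
for score, text in retrieve([1.0, 0.0], index, top_k=2):
    print(round(score, 2), text)
```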

Sync vs local-only

  • Local-only (default): Vector store on disk under ~/Library/Application Support/CherryStudio/...
  • Sync (optional): Push the index to S3-compatible storage (R2, MinIO) for cross-device sync, encrypted with a passphrase only you hold
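The "passphrase only you hold" part implies the encryption key is derived locally from your passphrase. The sketch below is an assumption about how such a scheme could work, using PBKDF2 from Python's standard library; a real client would then encrypt the index under this key with an authenticated cipher such as AES-GCM before uploading.

```python
import hashlib
import os

def derive_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte encryption key from a passphrase.

    The salt is stored alongside the ciphertext and is not secret;
    without the passphrase, the key cannot be reconstructed.
    """
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations)

salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
print(len(key))  # 32
```

Because the key never leaves your machine, the storage provider (R2, MinIO, etc.) only ever sees ciphertext.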

When to use Cherry Studio KB vs a hosted RAG

Cherry Studio KB                        Pinecone Assistant / similar hosted
Personal docs, sensitive content        Multi-user team docs
Offline access                          Always-online
Free app + your own LLM costs           Per-query subscription
Single device (or DIY sync)             Cross-device by default

FAQ

Q: Is Cherry Studio free? A: Yes — Cherry Studio is open-source under Apache-2.0. The app is free; you bring your own LLM API keys and pay only for inference. Local Ollama models are fully free.

Q: Can it handle large PDFs? A: Yes — large PDFs are chunked at the configured chunk size. A 500-page PDF takes ~1 minute to embed locally with Ollama and produces a few thousand chunks. Search is fast (cosine on a local FAISS-style index).
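The "few thousand chunks" figure follows from back-of-envelope arithmetic. Assuming roughly 3,000 characters per page (an assumed figure), each new chunk advances by chunk size minus overlap:

```python
def estimate_chunks(total_chars: int, chunk_size: int = 1000, overlap: int = 200) -> int:
    """Estimate chunk count: each chunk advances (chunk_size - overlap) chars."""
    step = chunk_size - overlap
    return max(1, -(-total_chars // step))  # ceiling division

# A 500-page PDF at ~3,000 characters per page:
print(estimate_chunks(500 * 3000))  # 1875
```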

Q: Does the knowledge base work with images? A: Mostly text-only for now. For PDFs with images, text is extracted from the embedded text layer; image-only pages yield no searchable content. Image search is on the roadmap but not stable in 1.4.


Source & Thanks

Built by kangfenmao. Licensed under Apache-2.0.

CherryHQ/cherry-studio — ⭐ 18,000+


