Knowledge · May 8, 2026 · 6 min read

Cherry Studio Knowledge Base — Local RAG with 50+ Formats

Cherry Studio Knowledge Base ingests PDFs, Office docs, and Markdown into a local vector index. Query offline, BYOK any LLM. Data stays on your machine.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, a per-adapter plan, and the raw content to help agents judge fit, risk, and next actions.

Needs Confirmation · 64/100 · Policy: confirm
Agent surface: any MCP/CLI agent
Type: Knowledge
Installation: Single
Trust: New
Entry point: Asset

Universal CLI command:
npx tokrepo install e8255b25-1bb1-47a8-bff9-ca5a445ce3f1
Introduction

Cherry Studio Knowledge Base lets the desktop app ingest 50+ file formats into a local vector index: PDFs, Word docs, Markdown, EPUB, even web bookmarks. Retrieval runs locally; answers come from your choice of LLM (OpenAI, Claude, or fully offline with Ollama). Best for: privacy-conscious users who want personal RAG without sending docs to a cloud service. Works with: Cherry Studio 1.4+ on macOS / Windows / Linux. Setup time: 5 minutes.


Build a knowledge base

  1. Download Cherry Studio from cherry-ai.com
  2. Settings → Models → add an embedding model (Ollama: nomic-embed-text, OpenAI: text-embedding-3-small, Voyage AI, etc.); a quick sanity check follows this list
  3. Sidebar → Knowledge → New Knowledge Base
  4. Name it, pick the embedding model, set chunk size (default 1000)
  5. Drag and drop files or paste a folder
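
Before ingesting anything, it helps to confirm the embedding model from step 2 actually responds. A minimal sketch using Ollama's /api/embeddings endpoint (this calls Ollama directly and is not a Cherry Studio API; it assumes Ollama is running locally and `ollama pull nomic-embed-text` has been done):

  import requests

  resp = requests.post(
      "http://localhost:11434/api/embeddings",
      json={"model": "nomic-embed-text", "prompt": "hello knowledge base"},
      timeout=30,
  )
  resp.raise_for_status()
  vector = resp.json()["embedding"]
  print(len(vector))  # nomic-embed-text returns 768 dimensions

If this prints a dimension count, Cherry Studio can use the same model for ingestion.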

Supported formats

Category     Formats
Documents    PDF, DOCX, DOC, RTF, ODT, EPUB
Office       XLSX, CSV, PPTX
Code         all text-based source (PY, JS, TS, GO, …)
Web          URL list (auto-fetches and chunks)
Markdown     MD, MDX
Notebook     IPYNB
Plain text   TXT, LOG

Query the knowledge base in chat

Enable the knowledge base toggle in any chat. Cherry Studio retrieves the top-k most relevant chunks per query and prepends them to the LLM prompt with citations.
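
Cherry Studio's exact prompt template isn't documented here, but the retrieve-then-prepend pattern it describes looks like this minimal sketch (the index layout and prompt wording are illustrative, not Cherry Studio internals):

  import numpy as np

  def top_k_chunks(query_vec, chunk_vecs, chunks, k=6, floor=0.6):
      # Rank chunks by cosine similarity; keep at most k above the floor.
      q = query_vec / np.linalg.norm(query_vec)
      m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
      sims = m @ q
      best = np.argsort(sims)[::-1][:k]
      return [(chunks[i], float(sims[i])) for i in best if sims[i] >= floor]

  def build_prompt(question, hits):
      # Prepend retrieved chunks, numbered so the model can cite them as [n].
      sources = "\n".join(f"[{n}] {text}" for n, (text, _) in enumerate(hits, 1))
      return ("Answer using the sources below and cite them as [n].\n\n"
              f"{sources}\n\nQuestion: {question}")

The k=6 and floor=0.6 defaults mirror the Top-K and Threshold settings described in the next section.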

Configure retrieval

Knowledge Base settings:
  Chunk size: 1000 chars
  Chunk overlap: 200 chars
  Top-K: 6 chunks per query
  Rerank: optional (BGE Reranker via Ollama)
  Threshold: 0.6 (cosine similarity floor)
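
Chunk size and overlap interact simply: each chunk starts (size − overlap) characters after the previous one, so adjacent chunks share that much context. A character-level sketch (Cherry Studio's real splitter may respect sentence or paragraph boundaries):

  def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
      # Each window advances by (size - overlap) characters.
      step = size - overlap
      return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

  print([len(p) for p in chunk("x" * 2500)])  # [1000, 1000, 900]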

Sync vs local-only

  • Local-only (default): vector store on disk, e.g. under ~/Library/Application Support/CherryStudio/... on macOS
  • Sync (optional): push the index to S3-compatible storage (R2, MinIO) for cross-device sync, encrypted with a passphrase only you hold (see the sketch below)
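
Cherry Studio's actual sync format isn't specified here; as a generic illustration of "passphrase-encrypted blob in S3-compatible storage", here is a sketch using the cryptography and boto3 packages (the file names, bucket, and endpoint are hypothetical):

  import base64, os
  import boto3
  from cryptography.fernet import Fernet
  from cryptography.hazmat.primitives import hashes
  from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

  def key_from_passphrase(passphrase: str, salt: bytes) -> bytes:
      # Derive a Fernet key from the passphrase; only the random salt is uploaded.
      kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=480_000)
      return base64.urlsafe_b64encode(kdf.derive(passphrase.encode()))

  salt = os.urandom(16)
  with open("kb-index.bin", "rb") as f:  # hypothetical local index file
      token = Fernet(key_from_passphrase("only-you-hold-this", salt)).encrypt(f.read())

  # R2 and MinIO speak the S3 API, so boto3 only needs a custom endpoint_url.
  s3 = boto3.client("s3", endpoint_url="https://<account>.r2.cloudflarestorage.com")
  s3.put_object(Bucket="kb-sync", Key="kb-index.enc", Body=salt + token)

The storage provider only ever sees ciphertext plus a random salt; without the passphrase the index cannot be decrypted.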

When to use Cherry Studio KB vs a hosted RAG

Cherry Studio KB                          Pinecone Assistant / similar hosted
Personal docs, sensitive content          Multi-user team docs
Offline access                            Always-online
Free app + your own LLM costs             Per-query subscription
Single device (or DIY sync)               Cross-device by default

FAQ

Q: Is Cherry Studio free? A: Yes — Cherry Studio is open-source under Apache-2.0. The app is free; you bring your own LLM API keys and pay only for inference. Local Ollama models are fully free.

Q: Can it handle large PDFs? A: Yes — large PDFs are chunked at the configured chunk size. A 500-page PDF takes ~1 minute to embed locally with Ollama and produces a few thousand chunks. Search is fast (cosine on a local FAISS-style index).
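
The "few thousand chunks" figure follows from the defaults. Assuming roughly 3,000 characters per page (an assumption; density varies by document):

  pages, chars_per_page = 500, 3000      # chars/page is a rough assumption
  size, overlap = 1000, 200
  print(pages * chars_per_page // (size - overlap))  # ≈ 1875 chunks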

Q: Does the knowledge base work with images? A: Text-only for now. For PDFs with images, text is extracted from the embedded text layer; image-only pages with no text layer contribute no content. Image search is on the roadmap but not stable in 1.4.


Quick Use

  1. Download Cherry Studio from cherry-ai.com
  2. Settings → Models → add an embedding model (Ollama nomic-embed-text or OpenAI text-embedding-3-small)
  3. Sidebar → Knowledge → New, drag and drop your docs

Source & Thanks

Built by kangfenmao. Licensed under Apache-2.0.

CherryHQ/cherry-studio — ⭐ 18,000+
