# Superduper: End-to-End AI Application Framework on Your Database

> An open-source Python framework for building AI applications directly on existing databases, integrating vector search, LLM inference, and RAG without moving data.

## Quick Use

```bash
pip install superduper
```

```python
from superduper import superduper, VectorIndex

# Connect to an existing MongoDB database by URI.
db = superduper("mongodb://localhost:27017/mydb")

# `model` is a placeholder for an indexing listener defined elsewhere.
db.apply(VectorIndex(indexing_listener=model))
```

## Introduction

Superduper is an open-source framework that brings AI capabilities directly to your existing database. Instead of extracting data into separate ML pipelines, Superduper lets you apply models, embeddings, and LLM-powered features as database-native operations, so your data stays in place.

## What Superduper Does

- Applies ML models and LLMs directly to database records
- Creates vector indexes for semantic search without a separate vector DB
- Builds RAG pipelines that query your existing data stores
- Triggers model inference automatically when new data arrives
- Supports MongoDB, PostgreSQL, MySQL, SQLite, and S3 as backends

## Architecture Overview

Superduper wraps your database connection with an AI-aware layer. Models register as listeners on collections or tables and execute automatically on inserts and updates. Vector indexes are maintained alongside regular data using the database's native storage. A scheduler coordinates batch and real-time inference, while a compute backend (local, Ray, or Dask) handles parallel execution.
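
The listener and vector-index flow described above can be sketched in plain Python. This is a conceptual toy and not Superduper's API: `ToyStore`, `add_listener`, `vector_search`, and `toy_embed` are hypothetical names invented for illustration, and the letter-frequency "embedding" only stands in for a real model.

```python
import math
from collections import defaultdict


def toy_embed(text):
    """Stand-in embedding: a 26-dim letter-frequency vector, NOT a real model."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyStore:
    """Hypothetical in-memory store with insert-triggered listeners."""

    def __init__(self):
        self.collections = defaultdict(list)
        self.listeners = defaultdict(list)

    def add_listener(self, collection, key, model, output_key):
        # Register the model, then backfill documents that already exist.
        self.listeners[collection].append((key, model, output_key))
        for doc in self.collections[collection]:
            doc[output_key] = model(doc[key])

    def insert(self, collection, doc):
        # Run every registered listener on the new record, then store it.
        doc = dict(doc)
        for key, model, output_key in self.listeners[collection]:
            doc[output_key] = model(doc[key])
        self.collections[collection].append(doc)
        return doc

    def vector_search(self, collection, query, vector_key, model, n=1):
        # Rank stored documents by cosine similarity to the embedded query.
        q = model(query)
        docs = [d for d in self.collections[collection] if vector_key in d]
        docs.sort(key=lambda d: cosine(q, d[vector_key]), reverse=True)
        return docs[:n]


store = ToyStore()
store.insert("docs", {"text": "zebra zoo"})                  # inserted before the listener exists
store.add_listener("docs", "text", toy_embed, "embedding")   # backfills the first record
store.insert("docs", {"text": "apple banana"})               # embedded automatically on insert
best = store.vector_search("docs", "banana", "embedding", toy_embed)[0]
print(best["text"])  # prints "apple banana"
```

The vectors live in the same records as the source text, which mirrors the idea of keeping the index alongside regular data. A real deployment would hand the per-record model calls to a scheduler and compute backend instead of running them inline.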

## Self-Hosting & Configuration

- Install via pip with optional extras for your database backend
- Connect by passing your existing database URI to the `superduper()` function
- Models are defined as Python classes or imported from Hugging Face
- Configure compute backends for distributed processing in YAML
- Supports Docker Compose for running all components together

## Key Features

- Database-native vector search eliminates the need for a separate vector store
- Change-data-capture triggers keep AI outputs fresh as data changes
- Multi-model pipelines chain embeddings, classifiers, and LLMs
- Version control for models and outputs enables reproducibility
- Works with both SQL and document databases without code changes

## Comparison with Similar Tools

- **LangChain**: orchestration framework; Superduper is database-first, not chain-first
- **Pinecone/Weaviate**: standalone vector DBs; Superduper adds vectors to your existing DB
- **MindsDB**: SQL-based AI queries; Superduper offers richer Python model integration
- **Feature stores (Feast)**: batch feature serving; Superduper does real-time model application

## FAQ

**Q: Do I need to migrate my data to use Superduper?**
A: No. Superduper connects to your existing database and operates in-place.

**Q: Which embedding models are supported?**
A: Any model from Hugging Face, OpenAI, Cohere, or custom PyTorch/TensorFlow models.

**Q: Can I use it for production workloads?**
A: Yes. Superduper supports distributed compute via Ray and is designed for production data volumes.

**Q: How does vector search performance compare to dedicated vector databases?**
A: For most use cases, performance is comparable. Dedicated vector DBs may be faster at very large scale (100M+ vectors).

## Sources

- https://github.com/superduper-io/superduper
- https://docs.superduper.io/

---

Source: https://tokrepo.com/en/workflows/asset-ea8d6896

Author: Script Depot