Scripts · May 24, 2026 · 3 min read

Superduper — End-to-End AI Application Framework on Your Database

An open-source Python framework for building AI applications directly on existing databases, integrating vector search, LLM inference, and RAG without moving data.

Direct install command
npx -y tokrepo@latest install ea8d6896-57ad-11f1-9bc6-00163e2b0d79 --target codex

Run after confirming the plan with a dry run.

Introduction

Superduper is an open-source framework that brings AI capabilities directly to your existing database. Instead of extracting data into separate ML pipelines, Superduper lets you apply models, embeddings, and LLM-powered features as database-native operations, keeping your data in place while adding intelligence on top.

What Superduper Does

  • Applies ML models and LLMs directly to database records
  • Creates vector indexes for semantic search without a separate vector DB
  • Builds RAG pipelines that query your existing data stores
  • Triggers model inference automatically when new data arrives
  • Supports MongoDB, PostgreSQL, MySQL, SQLite, and S3 as backends
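
To make the RAG bullet above concrete, here is a minimal sketch of the retrieve-then-prompt flow over records that stay where they are. It does not use Superduper's API: `RECORDS`, `retrieve`, `build_prompt`, and `answer` are hypothetical names, retrieval is a toy word-overlap score rather than a real embedding search, and the LLM is passed in as a plain callable.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical records standing in for rows or documents already in your database.
RECORDS = [
    {"id": 1, "text": "Superduper applies models directly to database records."},
    {"id": 2, "text": "Vector indexes enable semantic search over stored data."},
    {"id": 3, "text": "Ray and Dask handle parallel execution."},
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(question: str, records: list[dict], k: int = 2) -> list[dict]:
    """Return the k records sharing the most words with the question (toy scoring)."""
    q_tokens = tokenize(question)
    ranked = sorted(
        records,
        key=lambda r: len(q_tokens & tokenize(r["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: list[dict]) -> str:
    """Assemble a prompt that grounds the model in the retrieved records."""
    lines = [f"- {r['text']}" for r in context]
    return (
        "Answer using only this context:\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}"
    )


def answer(question: str, llm: Callable[[str], str],
           records: list[dict] = RECORDS, k: int = 2) -> str:
    """Retrieve context, build the prompt, and hand it to the supplied LLM callable."""
    return llm(build_prompt(question, retrieve(question, records, k)))
```

Passing `llm=lambda prompt: prompt` returns the assembled prompt unchanged, which is a convenient way to inspect what the model would see before wiring in a real one.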

Architecture Overview

Superduper wraps your database connection with an AI-aware layer. Models register as listeners on collections or tables, executing automatically on inserts and updates. Vector indexes are maintained alongside regular data using the database's native storage. A scheduler coordinates batch and real-time inference, while a compute backend (local, Ray, or Dask) handles parallel execution.
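
The listener idea described above can be illustrated with a small in-memory sketch. This is not Superduper's implementation or API: `ToyTable`, `add_listener`, and `insert` are hypothetical names, and the "models" are plain Python callables. It only shows the pattern of running registered models on each insert and storing their outputs alongside the original record.

```python
from __future__ import annotations

from typing import Any, Callable


class ToyTable:
    """In-memory stand-in for a table or collection that has model listeners."""

    def __init__(self) -> None:
        self.rows: list[dict[str, Any]] = []
        # Each listener is (source_field, output_field, model).
        self._listeners: list[tuple[str, str, Callable[[Any], Any]]] = []

    def add_listener(self, source_field: str, output_field: str,
                     model: Callable[[Any], Any]) -> None:
        """Register a model and backfill its output on rows that already exist."""
        self._listeners.append((source_field, output_field, model))
        for row in self.rows:
            row[output_field] = model(row[source_field])

    def insert(self, row: dict[str, Any]) -> dict[str, Any]:
        """Store a copy of the row after running every listener in registration order."""
        stored = dict(row)
        for source_field, output_field, model in self._listeners:
            stored[output_field] = model(stored[source_field])
        self.rows.append(stored)
        return stored


if __name__ == "__main__":
    table = ToyTable()
    table.add_listener("text", "n_words", lambda t: len(t.split()))
    # A later listener can consume an earlier listener's output.
    table.add_listener("n_words", "is_long", lambda n: n > 3)
    print(table.insert({"text": "models run when data arrives"}))
```

Because listeners run in registration order, a later listener can read a field written by an earlier one, which is one simple way to chain models into a pipeline.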

Self-Hosting & Configuration

  • Install via pip with optional extras for your database backend
  • Connect by passing your existing database URI to the superduper() function
  • Models are defined as Python classes or imported from Hugging Face
  • Configure compute backends for distributed processing in YAML
  • Supports Docker Compose for running all components together
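
As a sketch of the connection step listed above: the snippet checks the URI scheme with the standard library and then passes the URI to `superduper()`. The import path `from superduper import superduper` is an assumption that may differ between releases, so check the documentation for your installed version. `KNOWN_SCHEMES`, `check_uri`, and `connect` are hypothetical helpers, and the scheme list is only inferred from the backends named earlier.

```python
from urllib.parse import urlparse

# Assumed scheme list, inferred from the backends named in this article.
KNOWN_SCHEMES = {"mongodb", "postgresql", "mysql", "sqlite"}


def check_uri(uri: str) -> str:
    """Return the base scheme of a database URI, or raise if it is not recognised."""
    # "postgresql+psycopg2" or "mongodb+srv" reduce to their base scheme.
    scheme = urlparse(uri).scheme.split("+")[0]
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"Unrecognised database scheme: {scheme!r}")
    return scheme


def connect(uri: str):
    """Validate the URI, then hand it to superduper().

    The import is deferred so this module loads even when the package is absent.
    The import path below is an assumption, not a confirmed API.
    """
    check_uri(uri)
    from superduper import superduper  # assumed import path
    return superduper(uri)
```

A call such as `connect("mongodb://localhost:27017/mydb")` would then return the wrapped database handle, provided the package and the matching backend extra are installed.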

Key Features

  • Database-native vector search eliminates the need for a separate vector store
  • Change-data-capture triggers keep AI outputs fresh as data changes
  • Multi-model pipelines chain embeddings, classifiers, and LLMs
  • Version control for models and outputs enables reproducibility
  • Works with both SQL and document databases without code changes

Comparison with Similar Tools

  • LangChain — orchestration framework; Superduper is database-first, not chain-first
  • Pinecone/Weaviate — standalone vector DBs; Superduper adds vectors to your existing DB
  • MindsDB — SQL-based AI queries; Superduper offers richer Python model integration
  • Feature stores (Feast) — batch feature serving; Superduper does real-time model application

FAQ

Q: Do I need to migrate my data to use Superduper? A: No. Superduper connects to your existing database and operates in-place.

Q: Which embedding models are supported? A: Any model from Hugging Face, OpenAI, or Cohere, as well as custom PyTorch or TensorFlow models.

Q: Can I use it for production workloads? A: Yes. Superduper supports distributed compute via Ray and is designed for production data volumes.

Q: How does vector search performance compare to dedicated vector databases? A: For most use cases, performance is comparable. Dedicated vector DBs may be faster at very large scale (100M+ vectors).
