Introduction
CozoDB is an embedded database that combines relational tables, graph traversal, and vector similarity search under a single Datalog-based query language called CozoScript. It is designed for applications that need to query structured records, graph relationships, and vector embeddings through one interface instead of stitching together separate databases.
What CozoDB Does
- Stores relational data in tables with typed schemas and ACID transactions
- Supports recursive graph traversal queries via Datalog fixed-point evaluation
- Provides built-in HNSW vector indexes for approximate nearest-neighbor search
- Runs as an embedded library (Rust, Python, Node.js, Java, Swift, Go) or as a standalone server
- Allows time-travel queries that read a relation's historical state as of a past timestamp
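To make the relational and recursive bullets concrete, here is a minimal sketch that stores a small relation and runs a recursive reachability query over it. The relation name follows, its columns, and the helper function are hypothetical. The client argument is assumed to be any object exposing run(script, params), which is how the pycozo Client is typically used; verify the CozoScript syntax and the client API against the version you install.

```python
# Illustrative CozoScript snippets. Relation and column names are made up.
CREATE_FOLLOWS = ":create follows {src: String, dst: String}"

INSERT_FOLLOWS = """
?[src, dst] <- [['alice', 'bob'], ['bob', 'carol'], ['carol', 'dave']]
:put follows {src, dst}
"""

# Two rules with the same head: a base case and a recursive case.
# $start is a named parameter supplied at call time.
REACHABLE = """
reach[dst] := *follows{src: $start, dst}
reach[dst] := reach[mid], *follows{src: mid, dst}
?[dst] := reach[dst]
"""


def load_and_query(client, start):
    """Create the relation, load sample rows, and return everyone reachable from `start`.

    `client` is assumed to offer run(script, params=None), as pycozo's Client does.
    """
    client.run(CREATE_FOLLOWS)
    client.run(INSERT_FOLLOWS)
    return client.run(REACHABLE, {"start": start})
```

The recursion is expressed by letting the reach rule refer to itself; no explicit loop or depth limit appears in the query.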
Architecture Overview
CozoDB uses a storage engine abstraction that supports multiple backends: SQLite for embedded use, RocksDB for high-throughput workloads, and an in-memory engine for testing. The query engine compiles CozoScript into a Datalog evaluation plan with magic-set optimization. Graph queries use semi-naive evaluation for efficient fixed-point computation, and vector indexes are maintained as special stored relations alongside regular tables.
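Semi-naive evaluation can be illustrated in a few lines of plain Python. The sketch below is not CozoDB's implementation; it only shows the idea of joining the facts that were new in the previous round, rather than re-joining everything, until no new facts appear. The function name and sample data are illustrative.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def reachable_semi_naive(edges: Set[Edge]) -> Set[Edge]:
    """Transitive closure of `edges` via semi-naive fixed-point iteration.

    Datalog rules being evaluated:
        reach(a, b) :- edge(a, b).
        reach(a, c) :- reach(a, b), edge(b, c).
    """
    # Index edges by source so each join step is a dictionary lookup.
    successors: Dict[str, Set[str]] = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    total = set(edges)  # every fact derived so far
    delta = set(edges)  # facts that were new in the previous round

    while delta:
        # Join only the new facts with edge; older facts were joined already.
        derived = {
            (a, c)
            for (a, b) in delta
            for c in successors.get(b, ())
        }
        delta = derived - total  # keep only facts not seen before
        total |= delta
    return total


edges = {("a", "b"), ("b", "c"), ("c", "d")}
closure = reachable_semi_naive(edges)
print(sorted(closure))
```

The loop stops when a round produces nothing new, which is the fixed point. Cycles in the graph are safe because already-known facts are subtracted from each round's output.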
Self-Hosting & Configuration
- Install the Python client (pycozo) with pip, or add the cozo crate as a Cargo dependency
- Choose a storage backend (SQLite, RocksDB, or memory) when creating the database
- Run as a standalone HTTP server for language-agnostic access via REST API
- Create HNSW vector indexes on relations to enable similarity search
- Configure index parameters such as ef_construction and m to tune vector search
Key Features
- Datalog query language naturally expresses recursive graph patterns and joins
- HNSW vector indexes are first-class citizens alongside relational tables
- Time-travel queries read a relation as it stood at a chosen timestamp, for relations that opt in to keeping history
- Multiple storage backends let you choose between portability and performance
- Embeddable in Rust, Python, Node.js, Java, Swift, and Go with native bindings
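The time-travel feature is easier to picture with a toy model. The class below is plain Python and is not CozoDB's storage format or API; it only sketches the general idea of keeping timestamped assertions and retractions instead of overwriting rows, then answering "what was true at time t?". In CozoDB itself, as far as I understand the documentation, a relation opts in by giving its last key column the Validity type. All names here are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple


class VersionedRelation:
    """Toy append-only relation: every write is stamped, nothing is overwritten."""

    def __init__(self) -> None:
        # key -> list of (timestamp, is_assertion, value), sorted by timestamp
        self._history: Dict[str, List[Tuple[int, bool, Any]]] = {}

    def put(self, key: str, value: Any, ts: int) -> None:
        """Assert that `key` has `value` from time `ts` onward."""
        self._append(key, (ts, True, value))

    def retract(self, key: str, ts: int) -> None:
        """Record that `key` stops existing from time `ts` onward."""
        self._append(key, (ts, False, None))

    def _append(self, key: str, entry: Tuple[int, bool, Any]) -> None:
        versions = self._history.setdefault(key, [])
        versions.append(entry)
        versions.sort(key=lambda e: e[0])  # stable sort keeps write order on ties

    def as_of(self, key: str, ts: int) -> Optional[Any]:
        """Return the value visible at time `ts`, or None if absent or retracted."""
        current = None
        for version_ts, is_assertion, value in self._history.get(key, []):
            if version_ts > ts:
                break
            current = value if is_assertion else None
        return current


prices = VersionedRelation()
prices.put("widget", 10, ts=100)
prices.put("widget", 12, ts=200)
prices.retract("widget", ts=300)
print(prices.as_of("widget", 150))
```

Because old versions are never deleted, a query at any earlier timestamp gives the same answer no matter how many later writes occur.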
Comparison with Similar Tools
- Neo4j: a server-based graph database; CozoDB combines graph, relational, and vector queries in one embeddable engine
- DuckDB: an analytical SQL engine; CozoDB uses Datalog, which suits recursive and graph queries
- Milvus: a dedicated vector database; CozoDB integrates vector search with relational and graph queries
- SQLite: an embedded relational database; CozoDB can use SQLite as a storage backend and layers graph traversal and vector search on top
- Dgraph: a distributed graph database queried through GraphQL and its own DQL; CozoDB uses Datalog, which expresses recursive queries natively
FAQ
Q: What is CozoScript? A: CozoScript is CozoDB's query language based on Datalog. It supports relational queries, recursive graph traversal, aggregation, and vector search in a unified syntax.
Q: Can CozoDB replace a traditional SQL database? A: For embedded use cases, yes. CozoDB handles relational workloads and adds graph and vector capabilities. For large-scale OLTP, a dedicated SQL database may be more appropriate.
Q: Does CozoDB support concurrent access? A: Yes. CozoDB supports concurrent readers with serializable write transactions. The RocksDB backend provides the best concurrency performance.
Q: Is CozoDB suitable for AI/RAG applications? A: Yes. The built-in HNSW vector indexes make CozoDB a good fit for retrieval-augmented generation pipelines that also need structured and graph queries.
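As a rough illustration of why combining vector and graph queries helps in a retrieval pipeline, the sketch below ranks documents by similarity and then pulls in documents linked from the top hits. It is plain Python with brute-force cosine similarity standing in for an HNSW index, and it does not use CozoDB at all; the document names, embeddings, and links are invented for the example.

```python
import math
from typing import Dict, List, Sequence, Set


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: Sequence[float],
             embeddings: Dict[str, Sequence[float]],
             links: Dict[str, Set[str]],
             k: int = 2) -> List[str]:
    """Top-k documents by similarity, followed by their direct graph neighbors."""
    # Vector step: exhaustive scan here; an ANN index would approximate this.
    ranked = sorted(embeddings,
                    key=lambda doc: cosine(query, embeddings[doc]),
                    reverse=True)
    context = ranked[:k]
    # Graph step: add documents linked from the seeds, skipping duplicates.
    for doc in list(context):
        for neighbor in sorted(links.get(doc, ())):
            if neighbor not in context:
                context.append(neighbor)
    return context


embeddings = {
    "intro": [1.0, 0.0],
    "install": [0.9, 0.1],
    "faq": [0.0, 1.0],
    "tuning": [0.2, 0.8],
}
links = {"install": {"tuning"}}
context = retrieve([1.0, 0.0], embeddings, links, k=2)
print(context)
```

Here "tuning" is not similar to the query on its own, but it enters the context because a top-ranked document links to it. In a database that handles both steps, this can be one query rather than two round trips to separate systems.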