Skills · Mar 30, 2026 · 2 min read

Qdrant — High-Performance Vector Database

Vector database and search engine for AI applications. Handles billion-scale similarity search with filtering, sparse vectors, and multi-tenancy. Rust-powered. 30K+ stars.

AI Open Source · Community

Direct install command

npx -y tokrepo@latest install 1566710d-f5ed-46da-af8c-757475a10420 --target codex

Run after dry-run confirms the install plan.

TL;DR

Qdrant is a vector database built in Rust for fast similarity search with advanced filtering, sparse vectors, and multi-tenancy.

§01

What it is

Qdrant is a vector database and similarity search engine designed for AI applications. Written in Rust for performance, it stores high-dimensional vectors alongside arbitrary JSON payloads and supports filtered search, sparse vectors, and multi-tenancy. It runs as a standalone server, and the Python client also offers a local in-process mode for prototyping and tests.

Qdrant targets developers building RAG pipelines, recommendation engines, image search, anomaly detection, or any system that needs fast nearest-neighbor lookup at scale.

§02

How it saves time or tokens

Vector databases are the backbone of retrieval-augmented generation (RAG). Instead of stuffing entire documents into an LLM prompt, you embed and index them in Qdrant, then retrieve only the relevant chunks. This cuts token usage substantially: a well-tuned RAG pipeline might send 2,000 tokens of context instead of 20,000, which lowers both cost and latency.
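The arithmetic behind that comparison can be sketched in a few lines. This is only an illustration: `context_savings` is a hypothetical helper, and the per-token price is a made-up placeholder, not a real vendor rate.

```python
def context_savings(full_tokens: int, retrieved_tokens: int, price_per_1k: float) -> dict:
    """Compare sending whole documents with sending retrieved chunks only."""
    if full_tokens <= 0 or retrieved_tokens < 0:
        raise ValueError("full_tokens must be positive and retrieved_tokens non-negative")
    saved = full_tokens - retrieved_tokens
    return {
        "tokens_saved": saved,
        "fraction_saved": saved / full_tokens,
        "cost_saved_per_call": saved / 1000 * price_per_1k,
    }

# Figures from the text: 20,000 tokens of raw documents vs 2,000 tokens of retrieved chunks.
# price_per_1k=0.01 is an illustrative placeholder, not a quoted price.
report = context_savings(20_000, 2_000, price_per_1k=0.01)
print(report)
```

With the numbers from the text, nine tenths of the context tokens are avoided on every call, regardless of the price assumed.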

§03

How to use

Start Qdrant with Docker:

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

Create a collection and insert vectors:

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

client = QdrantClient('localhost', port=6333)

client.create_collection(
    collection_name='docs',
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

client.upsert(
    collection_name='docs',
    points=[
        PointStruct(id=1, vector=[0.1]*384, payload={'title': 'Getting Started'}),
        PointStruct(id=2, vector=[0.2]*384, payload={'title': 'Advanced Usage'}),
    ]
)

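Search with filters:

from qdrant_client.models import Filter, FieldCondition, MatchValue

# query_points is the current search entry point in qdrant-client; older releases used client.search.
# The filter keeps only points whose payload 'title' matches exactly.
results = client.query_points(
    collection_name='docs',
    query=[0.15]*384,
    query_filter=Filter(must=[FieldCondition(key='title', match=MatchValue(value='Advanced Usage'))]),
    limit=5,
).points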
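To make clear what a filtered similarity search computes, here is a brute-force stand-in written with the standard library only. It is a conceptual sketch, not how Qdrant works internally (Qdrant uses an approximate index rather than a linear scan), and `cosine`, `filtered_search`, and the `lang` payload field are hypothetical names chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def filtered_search(points, query, must_match, limit=5):
    """Keep points whose payload matches every key/value, then rank by cosine."""
    hits = [
        (cosine(p["vector"], query), p["id"])
        for p in points
        if all(p["payload"].get(k) == v for k, v in must_match.items())
    ]
    hits.sort(key=lambda h: h[0], reverse=True)
    return [(pid, score) for score, pid in hits[:limit]]

points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"lang": "en"}},
    {"id": 2, "vector": [0.6, 0.8], "payload": {"lang": "en"}},
    {"id": 3, "vector": [0.0, 1.0], "payload": {"lang": "de"}},
]
results = filtered_search(points, query=[0.0, 1.0], must_match={"lang": "en"})
print(results)
```

Point 3 is the closest vector to the query but is excluded by the payload filter, so point 2 ranks first. Note also that the placeholder vectors in the upsert example above ([0.1]*384 and [0.2]*384) point in the same direction, so under cosine distance they score identically against any query; real embeddings are needed to see a meaningful ranking.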

§04

Example

Feature	Qdrant
Language	Rust
Quantization	Scalar, Product, Binary
Sparse vectors	Yes
Multi-tenancy	Built-in
Filtering	Payload-based, nested
Sharding	Automatic
API	REST + gRPC

§05

Related on TokRepo

AI tools for RAG -- RAG frameworks and retrieval tools
AI tools for database -- database tools for AI applications

§06

Common pitfalls

Vector dimensionality must match your embedding model and the size configured on the collection. A mismatch (e.g., 384 vs 768) is rejected with an error at insert or query time.
Quantization reduces memory usage significantly but can lower recall. Test with your actual data before enabling in production.
On-disk storage mode is available for large datasets that do not fit in RAM, but search latency increases. Profile your workload to decide between in-memory and on-disk.
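The dimensionality pitfall is cheap to catch on the client side before anything is sent to the server. The following is a minimal sketch; `check_dimensions` and `COLLECTION_DIM` are hypothetical names, not part of the Qdrant client.

```python
def check_dimensions(vectors, expected_dim):
    """Raise early if any vector does not match the collection's configured size."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"vector {i} has {len(vec)} dimensions, expected {expected_dim}"
            )
    return True

COLLECTION_DIM = 384  # matches size=384 in the collection created earlier

ok = check_dimensions([[0.1] * 384, [0.2] * 384], COLLECTION_DIM)

try:
    # A 768-dimensional vector, e.g. from a different embedding model.
    check_dimensions([[0.1] * 768], COLLECTION_DIM)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

print(ok, mismatch_error)
```

Running a check like this right after the embedding step surfaces a model swap immediately, with a message that names the offending vector.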

Frequently Asked Questions

How does Qdrant compare to Pinecone?

Qdrant is open source and self-hostable, while Pinecone is a managed cloud service. Qdrant gives you full control over infrastructure and data residency. Pinecone offers zero-ops convenience. Performance is comparable for most workloads. Choose Qdrant if you need self-hosting or want to avoid vendor lock-in.

Can Qdrant handle billions of vectors?

Yes. Qdrant supports sharding and on-disk storage for large-scale deployments. Billions of vectors require distributed mode with multiple nodes. Quantization (scalar, product, or binary) reduces memory per vector, making billion-scale deployments practical on commodity hardware.
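A rough back-of-the-envelope estimate shows why quantization matters at this scale. The sketch below counts vector components only and assumes 8 bits per component for scalar quantization and 1 bit per component for binary quantization; it ignores the index, payloads, any retained original vectors, and other overhead, so real deployments will need more.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bits_per_component: int) -> int:
    """Bytes for vector components only; ignores index, payload, and overhead."""
    return num_vectors * dim * bits_per_component // 8

N, DIM = 1_000_000_000, 384  # one billion 384-dimensional vectors (illustrative)

float32_bytes = raw_vector_bytes(N, DIM, 32)  # unquantized float32 components
int8_bytes = raw_vector_bytes(N, DIM, 8)      # assumed 8-bit scalar quantization
binary_bytes = raw_vector_bytes(N, DIM, 1)    # assumed 1-bit binary quantization

for name, size in [("float32", float32_bytes), ("int8", int8_bytes), ("binary", binary_bytes)]:
    print(f"{name}: {size / 1e9:.0f} GB")
# float32: 1536 GB
# int8: 384 GB
# binary: 48 GB
```

Under these assumptions scalar quantization cuts the in-memory footprint by 4x and binary quantization by 32x, at the recall cost noted under common pitfalls.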

Does Qdrant support hybrid search?

Yes. Qdrant supports both dense and sparse vectors in the same collection. You can combine semantic similarity (dense) with keyword matching (sparse) using named vectors. This hybrid approach often improves retrieval quality compared to dense-only search.
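One common way to merge a dense result list with a sparse one is reciprocal rank fusion (RRF). The sketch below shows the general technique on made-up ids; it is not Qdrant's internal implementation, and the constant k=60 is a conventional default assumed here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked id lists; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

dense_hits = ["a", "b", "c"]   # hypothetical ids ranked by dense similarity
sparse_hits = ["b", "d", "a"]  # hypothetical ids ranked by sparse (keyword) score

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['b', 'a', 'd', 'c']
```

Documents that appear near the top of both lists ("b" and "a") rise above documents found by only one retriever, which is the effect hybrid search is after.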

What embedding models work with Qdrant?

Any embedding model that outputs fixed-dimension vectors works with Qdrant. Common choices include OpenAI text-embedding-3-small (1536d), Cohere embed-v3 (1024d), and sentence-transformers models (384d or 768d). You generate embeddings externally and store the resulting vectors in Qdrant.

Is Qdrant free?

Qdrant is open source under Apache 2.0 and free to self-host. Qdrant also offers a managed cloud service with a free tier for small projects and paid tiers for production workloads. The self-hosted version has no feature restrictions compared to the cloud version.

🙏

Source & Thanks

Created by Qdrant. Licensed under Apache 2.0. qdrant/qdrant — 30,000+ GitHub stars
