# Weaviate — AI-Native Vector Database

> Open-source vector database for AI applications with built-in vectorization, hybrid search, and multi-tenancy. Supports 1B+ vectors with sub-100ms latency.

## Install

```bash
docker compose up -d
# or
pip install weaviate-client
```

## Quick Use

```python
import weaviate
import weaviate.classes as wvc

client = weaviate.connect_to_local()

# Create a collection with auto-vectorization
collection = client.collections.create(
    name="Article",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(),
)

# Add data (auto-vectorized on insert)
collection.data.insert({"title": "AI trends", "body": "Vector databases are..."})

# Semantic search
results = collection.query.near_text(query="machine learning", limit=5)
for obj in results.objects:
    print(obj.properties["title"])

client.close()
```

## What is Weaviate?

Weaviate is an open-source, AI-native vector database designed for building AI applications at scale. It combines vector search with structured filtering, supports built-in vectorization from 20+ model providers, and handles billions of vectors with sub-100ms query latency.

**Answer-Ready**: Weaviate is an open-source, AI-native vector database with built-in vectorization, hybrid search (vector + keyword), multi-tenancy, and support for 1B+ vectors at sub-100ms latency. Used by Stack Exchange, Instabase, and Red Hat.

**Best for**: AI teams building RAG, semantic search, or recommendation systems.

**Works with**: OpenAI, Cohere, Hugging Face, Claude (via embeddings).

**Setup time**: Under 5 minutes with Docker.

## Core Features

### 1. Built-In Vectorization

No need to manage embedding pipelines:

```python
# Weaviate embeds objects automatically on insert
collection = client.collections.create(
    name="Document",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_cohere(),
    generative_config=wvc.config.Configure.Generative.anthropic(),
)
```

Supported providers: OpenAI, Cohere, Hugging Face, Google, AWS, Ollama, and more.
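Conceptually, `near_text` embeds the query with the configured vectorizer and ranks stored objects by vector similarity (cosine by default). A minimal pure-Python sketch of that ranking step — illustrative only, not Weaviate's implementation, with made-up toy vectors:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, docs, limit=5):
    """Rank documents by cosine similarity to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in docs]
    return [title for _, title in sorted(scored, reverse=True)[:limit]]

# Toy 3-dimensional "embeddings" -- real models produce hundreds of dimensions.
docs = [
    ("AI trends", [0.9, 0.1, 0.0]),
    ("Cooking basics", [0.0, 0.2, 0.9]),
    ("ML pipelines", [0.8, 0.3, 0.1]),
]
print(nearest([1.0, 0.2, 0.0], docs, limit=2))  # ['AI trends', 'ML pipelines']
```

Built-in vectorization means Weaviate handles both the embedding call and this ranking for you; you only ever pass raw text.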
### 2. Hybrid Search

Combine vector similarity with keyword matching:

```python
results = collection.query.hybrid(
    query="AI agent frameworks",
    alpha=0.75,  # 0 = keyword only, 1 = vector only
    limit=10,
)
```

### 3. Generative Search (RAG)

Query and generate in one step:

```python
results = collection.generate.near_text(
    query="vector database comparison",
    single_prompt="Summarize this article in 2 sentences: {body}",
    limit=3,
)
for obj in results.objects:
    print(obj.generated)  # LLM-generated summary
```

### 4. Multi-Tenancy

Isolate data per tenant for SaaS applications:

```python
collection = client.collections.create(
    name="UserDocs",
    multi_tenancy_config=wvc.config.Configure.multi_tenancy(enabled=True),
)
collection.tenants.create([wvc.tenants.Tenant(name="tenant_A")])
```

### 5. Filtering

Combine semantic search with structured filters:

```python
results = collection.query.near_text(
    query="machine learning",
    filters=wvc.query.Filter.by_property("category").equal("research")
        & wvc.query.Filter.by_property("year").greater_than(2024),
    limit=5,
)
```

## Deployment Options

| Option | Use Case |
|--------|----------|
| Docker | Local development |
| Weaviate Cloud | Managed production |
| Kubernetes | Self-hosted at scale |
| Embedded | In-process for testing |

## FAQ

**Q: How does it compare to Pinecone?**
A: Weaviate is open-source and self-hostable with built-in vectorization. Pinecone is managed-only and requires external embedding.

**Q: Can it handle production scale?**
A: Yes. Weaviate supports 1B+ vectors with horizontal scaling and sub-100ms p99 latency.

**Q: Does it support RAG out of the box?**
A: Yes. Generative search combines retrieval and LLM generation in a single query.

## Source & Thanks

> Created by [Weaviate](https://github.com/weaviate). Licensed under BSD-3-Clause.
>
> [weaviate/weaviate](https://github.com/weaviate/weaviate) — 12k+ stars
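As a footnote on the `alpha` parameter used by hybrid search: it weights the blend of keyword (BM25) and vector scores. A back-of-the-envelope sketch of that blending, assuming both scores are already normalized to [0, 1] (Weaviate's actual fusion algorithms, ranked fusion and relative-score fusion, differ in detail):

```python
def hybrid_score(keyword_score, vector_score, alpha=0.75):
    """Blend normalized keyword and vector scores.
    alpha=0 -> keyword only, alpha=1 -> vector only."""
    return (1 - alpha) * keyword_score + alpha * vector_score

# A doc that matches keywords exactly but is semantically weaker...
print(hybrid_score(keyword_score=0.9, vector_score=0.4, alpha=0.75))   # 0.525
# ...loses to one that is semantically close but shares few keywords.
print(hybrid_score(keyword_score=0.2, vector_score=0.95, alpha=0.75))  # 0.7625
```

With the default `alpha=0.75` in the example above, semantic similarity dominates; lowering `alpha` favors exact keyword matches.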