# Cohere Embed — Multilingual AI Embeddings API

> Generate high-quality multilingual embeddings for search and RAG. Cohere Embed v4 supports 100+ languages with specialized modes for documents, queries, and classification.

## Quick Use

```bash
pip install cohere
```

```python
import cohere

co = cohere.ClientV2(api_key="...")

# Generate embeddings
response = co.embed(
    texts=["What is machine learning?", "How does AI work?"],
    model="embed-v4.0",
    input_type="search_document",
    embedding_types=["float"],
)
print(len(response.embeddings.float_[0]))  # embedding dimensionality (e.g., 1024)
```

## What is Cohere Embed?

Cohere Embed is a multilingual embedding API that converts text into high-dimensional vectors for semantic search, RAG, and classification. Version 4.0 supports 100+ languages, offers specialized input types (document vs. query), and includes built-in compression for storage efficiency. It consistently ranks among the top embedding models on the MTEB benchmark.

**Answer-Ready**: Cohere Embed v4.0 generates multilingual embeddings for search and RAG. Top MTEB benchmark scores, 100+ languages, specialized input types (document/query/classification). Binary and int8 compression for 32x storage savings. Production API with a generous free tier.

**Best for**: Teams building multilingual search or RAG pipelines.

**Works with**: Any vector database, LangChain, LlamaIndex.

**Setup time**: Under 2 minutes.

## Core Features

### 1. Input Types

```python
# Different modes optimize for different tasks
docs = co.embed(texts=[...], input_type="search_document")      # for indexing
queries = co.embed(texts=[...], input_type="search_query")      # for searching
classify = co.embed(texts=[...], input_type="classification")   # for classification
cluster = co.embed(texts=[...], input_type="clustering")        # for clustering
```
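Once documents are embedded with `search_document` and the query with `search_query`, retrieval reduces to ranking by cosine similarity. Below is a minimal, dependency-free sketch of that ranking step; the toy 4-dimensional vectors stand in for the high-dimensional vectors `co.embed` returns, and `top_k` is a hypothetical helper, not part of the Cohere SDK:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k documents most similar to the query."""
    scored = sorted(enumerate(doc_vecs),
                    key=lambda iv: cosine(query_vec, iv[1]),
                    reverse=True)
    return [i for i, _ in scored[:k]]

# Toy vectors standing in for co.embed(...) output
docs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
]
query = [1.0, 0.05, 0.0, 0.0]
print(top_k(query, docs))  # [0, 2]
```

In production the loop is usually replaced by a vector database's nearest-neighbor search, but the scoring function is the same idea.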
### 2. Compression (32x Savings)

```python
response = co.embed(
    texts=["Hello world"],
    model="embed-v4.0",
    input_type="search_document",
    embedding_types=["float", "int8", "ubinary"],
)
# float:  1024 dims x 4 bytes = 4 KB per vector
# int8:   1024 dims x 1 byte  = 1 KB per vector (4x savings)
# binary: 1024 bits / 8       = 128 B per vector (32x savings)
```

### 3. Multilingual (100+ Languages)

```python
# The same model handles all languages; no separate models needed
texts = [
    "What is AI?",      # English
    "AI 是什么?",        # Chinese
    "AIとは何ですか?",    # Japanese
    "Was ist KI?",      # German
]
response = co.embed(texts=texts, model="embed-v4.0", input_type="search_document")
# Cross-lingual similarity works automatically
```

### 4. Batch Processing

```python
# Embed up to 96 texts per request
def chunks(items, size):
    for i in range(0, len(items), size):
        yield items[i:i + size]

all_embeddings = []
for batch in chunks(documents, 96):
    response = co.embed(texts=batch, model="embed-v4.0", input_type="search_document")
    all_embeddings.extend(response.embeddings.float_)
```

## Cohere Embed vs Alternatives

| Model | Dimensions | Languages | MTEB Score | Compression |
|-------|------------|-----------|------------|-------------|
| Cohere Embed v4.0 | 1024 | 100+ | Top 3 | float/int8/binary |
| OpenAI text-embedding-3-large | 3072 | 50+ | Top 5 | Matryoshka |
| Voyage AI v3 | 1024 | 20+ | Top 5 | No |
| BGE-M3 (open source) | 1024 | 100+ | Good | No |

## Pricing

| Tier | Embeddings/mo | Price |
|------|---------------|-------|
| Free | 1M | $0 |
| Production | Pay-as-you-go | $0.10 per 1M tokens |

## FAQ

**Q: How does it compare to OpenAI embeddings?**
A: Comparable quality on MTEB, better multilingual support, and built-in binary compression for significant storage savings.

**Q: Can I use it with Pinecone/Qdrant/Weaviate?**
A: Yes. Generate embeddings with Cohere and store them in any vector database.

**Q: Is there an open-source alternative?**
A: BGE-M3 and E5-Mistral are strong open-source options, but they require self-hosting.

## Source & Thanks

> Created by [Cohere](https://cohere.com).
> [cohere.com/embed](https://cohere.com/embed) — Multilingual embedding API

---

Source: https://tokrepo.com/en/workflows/dde04e91-9c33-4bbb-9cf6-6604b1681582
Author: Prompt Lab