# Cohere Embed — Multilingual AI Embeddings API

> Generate high-quality multilingual embeddings for search and RAG. Cohere Embed v4 supports 100+ languages with specialized modes for documents, queries, and classification.

## Quick Use

```bash
pip install cohere
```

```python
import cohere

co = cohere.ClientV2(api_key="...")

# Generate embeddings
response = co.embed(
    texts=["What is machine learning?", "How does AI work?"],
    model="embed-v4.0",
    input_type="search_document",
    embedding_types=["float"],
)
print(len(response.embeddings.float_[0]))  # embedding dimensionality (e.g., 1024)
```

## What is Cohere Embed?

Cohere Embed is a multilingual embedding API that converts text into high-dimensional vectors for semantic search, RAG, and classification. Version 4.0 supports 100+ languages, offers specialized input types (document vs. query), and includes built-in compression for storage efficiency. It consistently ranks among the top embedding models on the MTEB benchmark.

**Answer-Ready**: Cohere Embed v4.0 generates multilingual embeddings for search and RAG. Top MTEB benchmark scores, 100+ languages, specialized input types (document/query/classification). Binary and int8 compression for 32x storage savings. Production API with a generous free tier.

**Best for**: Teams building multilingual search or RAG pipelines.

**Works with**: Any vector database, LangChain, LlamaIndex.

**Setup time**: Under 2 minutes.

## Core Features

### 1. Input Types

```python
# Different modes optimize for different tasks
docs = co.embed(texts=[...], input_type="search_document")      # for indexing
queries = co.embed(texts=[...], input_type="search_query")      # for searching
classify = co.embed(texts=[...], input_type="classification")   # for classification
cluster = co.embed(texts=[...], input_type="clustering")        # for clustering
```
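Once documents are embedded with `search_document` and the query with `search_query`, retrieval reduces to ranking by cosine similarity. Below is a minimal, dependency-free sketch of that ranking step; the toy 4-dimensional vectors stand in for the high-dimensional vectors `co.embed` returns, and `top_k` is a hypothetical helper, not part of the Cohere SDK:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k documents most similar to the query."""
    scored = sorted(enumerate(doc_vecs),
                    key=lambda iv: cosine(query_vec, iv[1]),
                    reverse=True)
    return [i for i, _ in scored[:k]]

# Toy vectors standing in for co.embed(...) output
docs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
]
query = [1.0, 0.05, 0.0, 0.0]
print(top_k(query, docs))  # [0, 2]
```

In production the loop is usually replaced by a vector database's nearest-neighbor search, but the scoring function is the same idea.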
### 2. Compression (32x Savings)

```python
response = co.embed(
    texts=["Hello world"],
    model="embed-v4.0",
    input_type="search_document",
    embedding_types=["float", "int8", "ubinary"],
)
# float:  1024 dims x 4 bytes = 4 KB per vector
# int8:   1024 dims x 1 byte  = 1 KB per vector (4x savings)
# binary: 1024 bits / 8       = 128 B per vector (32x savings)
```

### 3. Multilingual (100+ Languages)

```python
# The same model handles all languages; no separate models needed
texts = [
    "What is AI?",      # English
    "AI 是什么?",        # Chinese
    "AIとは何ですか?",    # Japanese
    "Was ist KI?",      # German
]
response = co.embed(texts=texts, model="embed-v4.0", input_type="search_document")
# Cross-lingual similarity works automatically
```

### 4. Batch Processing

```python
# Embed up to 96 texts per request
def chunks(items, size):
    for i in range(0, len(items), size):
        yield items[i:i + size]

all_embeddings = []
for batch in chunks(documents, 96):
    response = co.embed(texts=batch, model="embed-v4.0", input_type="search_document")
    all_embeddings.extend(response.embeddings.float_)
```

## Cohere Embed vs Alternatives

| Model | Dimensions | Languages | MTEB Score | Compression |
|-------|------------|-----------|------------|-------------|
| Cohere Embed v4.0 | 1024 | 100+ | Top 3 | float/int8/binary |
| OpenAI text-embedding-3-large | 3072 | 50+ | Top 5 | Matryoshka |
| Voyage AI v3 | 1024 | 20+ | Top 5 | No |
| BGE-M3 (open source) | 1024 | 100+ | Good | No |

## Pricing

| Tier | Embeddings/mo | Price |
|------|---------------|-------|
| Free | 1M | $0 |
| Production | Pay-as-you-go | $0.10 per 1M tokens |

## FAQ

**Q: How does it compare to OpenAI embeddings?**
A: Comparable quality on MTEB, better multilingual support, and built-in binary compression for significant storage savings.

**Q: Can I use it with Pinecone/Qdrant/Weaviate?**
A: Yes. Generate embeddings with Cohere and store them in any vector database.

**Q: Is there an open-source alternative?**
A: BGE-M3 and E5-Mistral are strong open-source options, but they require self-hosting.

## Source & Thanks

> Created by [Cohere](https://cohere.com).
> [cohere.com/embed](https://cohere.com/embed) — Multilingual embedding API

---

Source: https://tokrepo.com/en/workflows/dde04e91-9c33-4bbb-9cf6-6604b1681582
Author: Prompt Lab