Scripts · April 1, 2026 · 1 min read

Sentence Transformers — State-of-the-Art Embeddings

Sentence Transformers computes text embeddings for semantic search, similarity, and reranking. It has 18.5K+ GitHub stars and 15,000+ pre-trained models, covers dense, sparse, and reranker models, supports multiple languages, and is licensed under Apache 2.0.

TokRepo Picks · Community
Quick Start

Try it first, then decide whether to dig deeper


# Install
pip install -U sentence-transformers

# Compute embeddings and compare them
python -c "
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load a small general-purpose embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')
sentences = ['AI is amazing', 'Vector search is fast', 'I love pizza']
embeddings = model.encode(sentences)

# Compare the first sentence against the other two using cosine similarity
print(cos_sim(embeddings[0], embeddings[1]))  # Higher similarity expected
print(cos_sim(embeddings[0], embeddings[2]))  # Lower similarity expected
"

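To make the printed scores concrete, here is a minimal sketch of what a cosine-similarity comparison computes, written in plain Python on toy vectors. The helper name cosine_similarity and the three-dimensional vectors are illustrative assumptions and not part of the library; real embeddings from all-MiniLM-L6-v2 have 384 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings (values made up for illustration).
v_ai = [0.9, 0.1, 0.0]
v_search = [0.8, 0.3, 0.1]
v_pizza = [0.0, 0.2, 0.9]

print(cosine_similarity(v_ai, v_search))  # close to 1: similar direction
print(cosine_similarity(v_ai, v_pizza))   # close to 0: nearly orthogonal
```

Scores near 1 mean the two vectors point in almost the same direction, which is how the quick start distinguishes related sentences from unrelated ones.
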
Introduction

Sentence Transformers is a widely used Python framework for computing text embeddings that enable semantic search, text similarity, clustering, and reranking. It has 18,500+ GitHub stars and an Apache 2.0 license, and it provides access to 15,000+ pre-trained models on Hugging Face. It supports dense and sparse embedding models as well as rerankers, offers multi-lingual models, and includes training support for custom models. It is used by thousands of production applications for semantic search and RAG pipelines.

Best for: Developers building semantic search, RAG, text clustering, or similarity applications
Works with: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
Models: 15,000+ pre-trained on Hugging Face Hub


Key Features

  • 15,000+ models: Pre-trained embeddings on Hugging Face Hub
  • Dense + sparse + reranker: Multiple embedding strategies
  • Semantic search: Encode queries and documents for similarity matching
  • Multi-lingual: Cross-language embedding support
  • Custom training: Fine-tune models on your own data
  • Clustering + paraphrase mining: Built-in applications

FAQ

Q: What is Sentence Transformers?
A: A widely used Python embedding framework with 18.5K+ GitHub stars. It offers 15,000+ pre-trained models for semantic search, similarity, and reranking, including dense, sparse, and multi-lingual models, under the Apache 2.0 license.

Q: How do I install it?
A: Run pip install -U sentence-transformers, then call SentenceTransformer('all-MiniLM-L6-v2').encode(texts).



Source and Acknowledgments

Created by UKP Lab. Licensed under Apache 2.0. UKPLab/sentence-transformers — 18,500+ GitHub stars
