Scripts · Apr 1, 2026 · 2 min read

Sentence Transformers — State-of-the-Art Embeddings

Sentence Transformers computes text embeddings for semantic search, similarity, and reranking. 18.5K+ GitHub stars. 15,000+ pre-trained models, dense/sparse/reranker, multi-lingual. Apache 2.0.

TokRepo Featured · Community
Quick Use

# Install
pip install -U sentence-transformers

# Compute embeddings and compare them
python -c "
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer('all-MiniLM-L6-v2')
sentences = ['AI is amazing', 'Vector search is fast', 'I love pizza']
embeddings = model.encode(sentences)
print(cos_sim(embeddings[0], embeddings[1]))  # two tech-related sentences
print(cos_sim(embeddings[0], embeddings[2]))  # tech vs. unrelated food sentence
"

Intro

Sentence Transformers is a widely used Python framework for computing text embeddings that power semantic search, text similarity, clustering, and reranking. It has 18,500+ GitHub stars and an Apache 2.0 license, and it gives access to 15,000+ pre-trained models on Hugging Face. It supports dense, sparse, and reranker models, multilingual embeddings, and training of custom models. Thousands of production applications use it for semantic search and RAG pipelines.

Best for: Developers building semantic search, RAG, text clustering, or similarity applications
Works with: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
Models: 15,000+ pre-trained on Hugging Face Hub
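
Reranking, mentioned in the intro, usually means re-scoring a short list of retrieved candidates with a slower but more precise model. The sketch below shows that step under stated assumptions: `rerank`, `word_overlap`, and `cross_encoder_scorer` are hypothetical names, `word_overlap` is only a toy stand-in scorer, and the model name `cross-encoder/ms-marco-MiniLM-L-6-v2` is an assumed example that does not come from this page.

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Re-order candidates by score_fn(query, candidate), highest first."""
    scored = [(text, float(score_fn(query, text))) for text in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def word_overlap(query, text):
    """Toy stand-in scorer (hypothetical): fraction of query words found in text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def cross_encoder_scorer(model_name="cross-encoder/ms-marco-MiniLM-L-6-v2"):
    """Build a score_fn backed by a CrossEncoder.

    Assumes sentence-transformers is installed; the model (an assumed
    example name) is downloaded on first use.
    """
    from sentence_transformers import CrossEncoder  # imported lazily

    model = CrossEncoder(model_name)
    # predict() takes a list of (query, text) pairs and returns one score per pair.
    return lambda query, text: model.predict([(query, text)])[0]
```

To use a real reranker, pass `cross_encoder_scorer()` as `score_fn`. Scoring one pair per call keeps the sketch simple; batching all pairs into a single `predict` call would be faster in practice.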


Key Features

  • 15,000+ models: Pre-trained embeddings on Hugging Face Hub
  • Dense + sparse + reranker: Multiple embedding strategies
  • Semantic search: Encode queries and documents for similarity matching
  • Multi-lingual: Cross-language embedding support
  • Custom training: Fine-tune models on your own data
  • Clustering + paraphrase mining: Built-in applications
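
The semantic search feature listed above can be sketched as follows. This is a minimal illustration rather than the library's own implementation: `cosine_similarity`, `rank_documents`, and `search` are hypothetical helper names, the ranking is done in plain Python for clarity, and only `search` touches Sentence Transformers (it assumes the package is installed and that the `all-MiniLM-L6-v2` model from the Quick Use block can be downloaded).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return (doc_index, score) pairs, best match first."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def search(query, documents, model_name="all-MiniLM-L6-v2", top_k=3):
    """Embed the query and documents, then rank documents by similarity.

    Assumes sentence-transformers is installed; the model is downloaded
    on first use.
    """
    from sentence_transformers import SentenceTransformer  # imported lazily

    model = SentenceTransformer(model_name)
    doc_vecs = model.encode(documents).tolist()
    query_vec = model.encode([query])[0].tolist()
    hits = rank_documents(query_vec, doc_vecs, top_k)
    return [(documents[i], score) for i, score in hits]
```

Calling `search("how do embeddings work?", docs)` with a list of strings returns the best-matching documents with their scores. For larger corpora, the library's own `sentence_transformers.util.semantic_search` is likely the better choice, since it works on batched tensors.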

FAQ

Q: What is Sentence Transformers? A: A Python framework for computing text embeddings, with 18.5K+ GitHub stars and 15,000+ pre-trained models for semantic search, similarity, and reranking. It supports dense, sparse, and multilingual models and is licensed under Apache 2.0.

Q: How do I install it? A: Run pip install -U sentence-transformers, then call SentenceTransformer('all-MiniLM-L6-v2').encode(texts) on a list of strings.


Source & Thanks

Created by UKP Lab. Licensed under Apache 2.0. Repository: UKPLab/sentence-transformers (18,500+ GitHub stars)
