
fastText — Efficient Text Classification and Embeddings by Meta

A library for efficient learning of word representations and text classification, capable of training on billions of words in minutes.

Introduction

fastText is a library from Meta AI Research for efficient text classification and word representation learning. It extends the Word2Vec approach with subword information, enabling it to generate embeddings for out-of-vocabulary words and train classifiers on large datasets in seconds rather than hours.

What fastText Does

  • Learns word vectors using subword (character n-gram) information for robust embeddings
  • Trains supervised text classifiers that scale to billions of examples
  • Provides pre-trained word vectors for 157 languages
  • Supports both CBOW and Skip-gram training objectives (a minimal example follows this list)
  • Offers quantization to compress models by 10x with minimal accuracy loss
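
A minimal sketch of both training modes through the official Python bindings (installed with pip install fasttext); corpus.txt and train.txt are placeholder files, and the hyperparameter values are illustrative:

    import fasttext

    # Unsupervised word vectors: pass model="cbow" or model="skipgram".
    # corpus.txt is plain text, one sentence per line.
    emb = fasttext.train_unsupervised("corpus.txt", model="skipgram")
    print(emb.get_word_vector("example").shape)  # (100,) with the default dimension

    # Supervised classifier: train.txt uses the __label__ prefix format
    # described in the FAQ below.
    clf = fasttext.train_supervised(input="train.txt", lr=0.5, epoch=25, wordNgrams=2)
    labels, probs = clf.predict("this library trains remarkably fast")
    print(labels, probs)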

Architecture Overview

fastText represents each word as a bag of character n-grams plus the word itself. During training, it learns embeddings for these subword units and composes word vectors by summing them. For classification, it uses a shallow neural network with a linear classifier on top of averaged word embeddings, achieving accuracy competitive with deep models at a fraction of the compute cost. The hierarchical softmax option further speeds up training on datasets with many labels.
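
Because a word's vector is the sum of its subword embeddings, a trained model can compose a vector for a word it never saw during training. A small sketch, again using a placeholder corpus.txt:

    import fasttext

    # minn/maxn set the range of character n-gram lengths; 3 and 6 are
    # also the library's defaults for skipgram training.
    model = fasttext.train_unsupervised("corpus.txt", model="skipgram", minn=3, maxn=6)

    # An out-of-vocabulary (here, misspelled) word still gets a vector,
    # composed from the n-grams it shares with known words.
    vec = model.get_word_vector("embeddinngs")
    print(vec.shape)

    # Nearest neighbors are looked up over the learned vocabulary.
    print(model.get_nearest_neighbors("embedding", k=5))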

Self-Hosting & Configuration

  • Install via pip or conda, or compile from source for the C++ CLI tools
  • Pre-trained vectors available for download from the fastText website
  • Training parameters (learning rate, epochs, n-grams) are set via CLI flags
  • Use the quantize command to reduce model size for deployment on resource-constrained systems (sketched after this list)
  • The Python API wraps the C++ core for easy integration into data pipelines
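
A sketch of that deployment path through the Python API, which wraps the same C++ core as the quantize CLI command; train.txt is a placeholder:

    import fasttext

    clf = fasttext.train_supervised(input="train.txt")
    clf.save_model("model.bin")   # full-precision model; can be large

    # Product quantization: retrain=True fine-tunes after quantizing,
    # and cutoff prunes the vocabulary to the most important features.
    clf.quantize(input="train.txt", qnorm=True, retrain=True, cutoff=100000)
    clf.save_model("model.ftz")   # compressed model for constrained targets

    reloaded = fasttext.load_model("model.ftz")
    print(reloaded.predict("quantized models serve predictions the same way"))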

Key Features

  • Subword embeddings handle misspellings, morphology, and rare words gracefully
  • Speed: trains on a billion words in under ten minutes and classifies half a million sentences in under a minute on a standard multicore CPU
  • Pre-trained vectors for 157 languages trained on Common Crawl and Wikipedia
  • Automatic hyperparameter tuning via the autotune feature (see the sketch after this list)
  • Model compression through product quantization for mobile and edge deployment
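
A sketch of autotune via the Python API; valid.txt is a placeholder held-out set in the same __label__ format as train.txt:

    import fasttext

    # Autotune searches learning rate, epochs, n-gram settings, and more,
    # optimizing the F1 score on the validation file by default;
    # autotuneDuration caps the search time in seconds.
    clf = fasttext.train_supervised(
        input="train.txt",
        autotuneValidationFile="valid.txt",
        autotuneDuration=300,
    )

    # test() returns (number of examples, precision@1, recall@1).
    print(clf.test("valid.txt"))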

Comparison with Similar Tools

  • Word2Vec — pioneered word embeddings but lacks subword information; fastText handles OOV words naturally
  • GloVe — global co-occurrence matrix approach; fastText is faster to train and supports subword units
  • spaCy — full NLP pipeline with built-in vectors; fastText focuses purely on embeddings and classification
  • Sentence Transformers — produces contextual sentence embeddings via Transformers; fastText is simpler and faster
  • scikit-learn text classifiers — flexible but slower on large datasets; fastText is optimized for scale

FAQ

Q: Can fastText handle languages with rich morphology? A: Yes. Subword n-grams capture morphological patterns, making it effective for agglutinative languages like Finnish, Turkish, and Korean.

Q: How does fastText compare to Transformer-based embeddings? A: Transformer models produce contextual embeddings and generally achieve higher accuracy on benchmarks, but fastText is orders of magnitude faster and works well when compute or latency budgets are tight.

Q: What format does the training data need? A: For supervised classification, each line should contain one or more labels, each prefixed with __label__, followed by the text. For unsupervised training, plain text with one sentence per line.
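
For instance, a supervised training file might contain lines like these (labels and texts are illustrative; a line may carry multiple labels):

    __label__positive Fast training and surprisingly accurate.
    __label__negative Segfaults on my corpus with no error message.
    __label__positive __label__brief Works.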

Q: Is fastText suitable for production use? A: Yes. The C++ core is fast and memory-efficient. Quantized models can run on mobile devices, and the library has been deployed at scale inside Meta.
