# FunASR: End-to-End Speech Recognition Toolkit

> FunASR is an open-source speech recognition toolkit by Alibaba DAMO Academy supporting ASR, voice activity detection, punctuation restoration, and text normalization. It ships pretrained models for 50+ languages and provides production-ready server deployment with streaming support.

## Quick Use

```bash
pip install funasr
python3 -c "
from funasr import AutoModel
model = AutoModel(model='iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch')
result = model.generate(input='audio.wav')
print(result)
"
```

## Introduction

FunASR provides a complete pipeline for automatic speech recognition, from audio input to formatted text output. It bundles state-of-the-art pretrained models (Paraformer, SenseVoice, Whisper-compatible) with convenient Python APIs and a deployable gRPC/WebSocket server.

## What FunASR Does

- Performs speech-to-text transcription for 50+ languages with pretrained models
- Detects voice activity to segment audio into speech and silence regions
- Restores punctuation and performs inverse text normalization on transcriptions
- Supports both offline (batch) and online (streaming) recognition modes
- Provides a runtime server for production deployment with GPU acceleration

## Architecture Overview

FunASR's core is built on PyTorch and wraps multiple ASR architectures (Paraformer, Conformer, Transformer, Whisper) behind a unified `AutoModel` interface. The Paraformer model uses a non-autoregressive architecture: a predictor module estimates the number of output tokens, which lets the decoder produce all tokens in a single parallel pass. The runtime server is a C++ service (gRPC/WebSocket) that loads ONNX-exported models with ONNX Runtime for low-latency inference and accepts WebSocket connections for streaming audio.

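
To make the predictor idea concrete, here is a toy sketch in plain Python. It assumes a CIF-style (continuous integrate-and-fire) predictor, which is an assumption and not something stated above: each acoustic frame carries a weight, the weights are accumulated, and one token embedding is emitted each time the running sum reaches a threshold. The names `cif_fire`, `alphas`, and `frames` are hypothetical and are not part of the FunASR API; real models learn the weights and operate on tensors.

```python
from typing import List


def cif_fire(alphas: List[float], frames: List[List[float]],
             threshold: float = 1.0) -> List[List[float]]:
    """Toy integrate-and-fire step (illustrative only, not FunASR code).

    Accumulates per-frame weights and emits one weighted-sum embedding
    each time the accumulated weight reaches the threshold.
    Assumes every weight is below the threshold, so a frame fires at most once.
    """
    dim = len(frames[0])
    tokens: List[List[float]] = []
    acc_weight = 0.0
    acc_vec = [0.0] * dim
    for alpha, frame in zip(alphas, frames):
        if acc_weight + alpha < threshold:
            # Not enough evidence for a token yet: keep integrating.
            acc_weight += alpha
            acc_vec = [a + alpha * f for a, f in zip(acc_vec, frame)]
        else:
            # Fire: use just enough of this frame to reach the threshold.
            used = threshold - acc_weight
            tokens.append([a + used * f for a, f in zip(acc_vec, frame)])
            # Carry the unused part of the weight into the next token.
            remainder = alpha - used
            acc_weight = remainder
            acc_vec = [remainder * f for f in frame]
    return tokens


# Hypothetical weights and 1-dimensional frame features for six frames.
alphas = [0.5, 0.5, 0.25, 0.75, 0.5, 0.5]
frames = [[1.0], [3.0], [2.0], [6.0], [4.0], [0.0]]

predicted_len = round(sum(alphas))  # estimated token count
tokens = cif_fire(alphas, frames)   # one embedding per predicted token

print(predicted_len, tokens)  # prints: 3 [[2.0], [5.0], [2.0]]
```

Because the number of token embeddings is known before decoding starts, a decoder can process all of them at once instead of generating one token per step, which is the property the paragraph above refers to as single-pass parallel decoding.
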
## Self-Hosting & Configuration

- Install via pip: `pip install funasr` (Python 3.8+)
- Models download automatically from ModelScope or Hugging Face on first use
- Deploy the production server using the Docker image: `funasr-runtime-sdk-gpu`
- Configure the server via command-line flags for model paths, ports, and thread count
- Stream audio to the server over WebSocket for real-time transcription

## Key Features

- Paraformer achieves fast non-autoregressive decoding with high accuracy on Chinese and English
- Streaming mode delivers partial results with low latency for live captioning
- Supports hotword boosting to improve recognition of domain-specific terms
- Includes speaker diarization to distinguish who is speaking
- Production C++ runtime with ONNX optimization for enterprise deployment

## Comparison with Similar Tools

- **Whisper (OpenAI)**: strong multilingual ASR; FunASR offers faster non-autoregressive models and a production server
- **whisper.cpp**: C++ Whisper inference; FunASR provides a broader toolkit with VAD, punctuation, and diarization
- **Faster Whisper**: CTranslate2-based speedup; FunASR's Paraformer is natively non-autoregressive for even lower latency
- **Vosk**: offline speech recognition; FunASR supports both streaming and batch with a wider model zoo
- **DeepSpeech**: Mozilla's end-to-end ASR (archived); FunASR is actively maintained with newer architectures

## FAQ

**Q: Which languages are supported?**
A: FunASR ships models covering 50+ languages, with particular strength in Chinese (including 7 dialects and 26 accents), English, Japanese, and Korean.

**Q: Can I fine-tune models on my own data?**
A: Yes. FunASR provides training scripts and recipes for fine-tuning any supported model on custom datasets.

**Q: What is the recommended deployment for production?**
A: Use the Docker-based runtime server with GPU support. It handles concurrent WebSocket connections and delivers optimized throughput via ONNX Runtime.


**Q: How does Paraformer compare to Whisper in speed?**
A: Paraformer decodes all output tokens in a single parallel pass, whereas Whisper generates tokens one at a time (autoregressively). This makes Paraformer significantly faster, especially on long audio segments.

## Sources

- https://github.com/modelscope/FunASR
- https://www.funasr.com/

---

Source: https://tokrepo.com/en/workflows/asset-9e95d508
Author: AI Open Source