Scripts · Apr 1, 2026 · 2 min read

Bark — AI Text-to-Audio with Music & Effects

Bark is a transformer-based text-to-audio model by Suno that generates speech, music, and sound effects. It has 39.1K+ GitHub stars, supports 12+ languages and 100+ voice presets, can produce non-speech audio, and is MIT licensed.

TokRepo Picks · Community
Quick Use

Use it first, then decide how deep to go

# Install
pip install git+https://github.com/suno-ai/bark.git

# Generate speech (the quoted heredoc keeps the shell from interpreting "!" or quotes)
python - <<'EOF'
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

preload_models()  # load the model weights before generating
audio = generate_audio("Hello, my name is Bark. [laughs] I can also make music!")
write_wav("bark_output.wav", SAMPLE_RATE, audio)
print("Saved bark_output.wav")
EOF

The full model requires about 12 GB of VRAM. On 8 GB GPUs, set the environment variable SUNO_USE_SMALL_MODELS=True to load the smaller models instead.
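
The small-model switch can also be set from Python rather than the shell. The following is a minimal sketch, not an official recipe: the helper names enable_small_models and generate_to_wav and the output path are hypothetical, and it assumes the variable has to be set before bark is imported.

```python
import os


def enable_small_models() -> None:
    """Ask Bark to load its smaller checkpoints (intended for ~8 GB GPUs).

    Assumption: Bark reads this variable when it is imported, so call this
    before any `import bark` statement runs.
    """
    os.environ["SUNO_USE_SMALL_MODELS"] = "True"


def generate_to_wav(text: str, path: str = "bark_small.wav") -> str:
    """Generate speech for `text` with the small models and save it as a WAV file."""
    enable_small_models()

    # Imported here, after the environment variable is set.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()
    audio = generate_audio(text)
    write_wav(path, SAMPLE_RATE, audio)
    return path
```

Calling generate_to_wav("Hello from the small model.") would write bark_small.wav to the current directory. The Bark imports sit inside the function so that the environment variable is always in place first.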


Intro

Bark is a transformer-based text-to-audio model by Suno that generates highly realistic multilingual speech, music, background noise, and sound effects from text prompts. It has 39,100+ GitHub stars and is MIT licensed. Bark supports 12+ languages (English, Spanish, French, German, Japanese, Korean, Chinese, and more), 100+ voice presets, non-speech audio generation (laughter, sighing, music), and long-form generation. On enterprise GPUs it generates audio in roughly real time, and it also runs on CPU; the project reports a 2x speed-up on GPU and a 10x speed-up on CPU compared with its earlier releases.
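
Voice presets and long-form generation can be combined by generating one sentence at a time with the same preset and joining the results. The following is a rough sketch under stated assumptions: split_sentences and generate_long_form are hypothetical helpers, the regex splitter is a naive stand-in for a real sentence tokenizer, and "v2/en_speaker_6" is used as an example preset name passed through Bark's history_prompt argument.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def generate_long_form(text: str, voice: str = "v2/en_speaker_6", pause_seconds: float = 0.25):
    """Generate each sentence with the same voice preset and join the clips.

    Returns one NumPy array sampled at Bark's SAMPLE_RATE.
    """
    # Heavy dependencies are imported lazily so the splitter works without them.
    import numpy as np
    from bark import SAMPLE_RATE, generate_audio

    # A short silence inserted after every sentence.
    silence = np.zeros(int(pause_seconds * SAMPLE_RATE), dtype=np.float32)
    pieces = []
    for sentence in split_sentences(text):
        pieces.append(generate_audio(sentence, history_prompt=voice))
        pieces.append(silence)
    return np.concatenate(pieces) if pieces else silence[:0]
```

Reusing one preset for every sentence is what keeps the speaker consistent across the whole clip. Text tags such as [laughs] stay attached to the sentence they appear in, because the splitter only breaks after sentence-ending punctuation.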

Best for: Developers building creative audio applications such as podcasts, games, voice assistants, and music generation
Works with: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
Languages: 12+ (English, Spanish, French, German, Japanese, Korean, Chinese, Hindi, and more)


Key Features

  • Text-to-audio: Speech, music, background noise, and sound effects from text
  • 12+ languages: Multilingual speech with natural prosody
  • 100+ voice presets: Diverse speakers across languages and styles
  • Non-speech sounds: Laughter, sighing, music, ambient noise via text tags
  • HuggingFace integration: Transformers v4.31.0+ compatible
  • Long-form generation: Extended audio via sequential generation
  • MIT licensed: Full commercial use permitted
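
For the HuggingFace integration listed above, a minimal sketch with the Transformers library might look like the following. The function name synthesize_with_transformers is hypothetical, the "suno/bark-small" checkpoint and the "v2/en_speaker_6" preset are example choices rather than requirements, and the call needs transformers and PyTorch installed.

```python
def synthesize_with_transformers(
    text: str,
    voice_preset: str = "v2/en_speaker_6",
    model_id: str = "suno/bark-small",
):
    """Generate audio for `text` through the Transformers Bark classes.

    Returns (waveform, sample_rate), where waveform is a 1-D NumPy array.
    """
    # Imported lazily: loading transformers and torch is slow.
    from transformers import AutoProcessor, BarkModel

    processor = AutoProcessor.from_pretrained(model_id)
    model = BarkModel.from_pretrained(model_id)

    # The processor tokenizes the text and attaches the chosen voice preset.
    inputs = processor(text, voice_preset=voice_preset)
    audio = model.generate(**inputs)

    waveform = audio.cpu().numpy().squeeze()
    sample_rate = model.generation_config.sample_rate
    return waveform, sample_rate
```

The returned pair can be passed to scipy.io.wavfile.write(path, sample_rate, waveform) to save the result. Swapping model_id to "suno/bark" would select the full-size checkpoint at the cost of more memory.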

FAQ

Q: What is Bark? A: Bark is a text-to-audio model by Suno with 39.1K+ GitHub stars. It generates realistic speech in 12+ languages, plus music and sound effects, offers 100+ voice presets, and is MIT licensed.

Q: How do I install Bark? A: Run pip install git+https://github.com/suno-ai/bark.git. Do not run pip install bark, which installs an unrelated package. The full model requires about 12 GB of VRAM.


Source & Thanks

Created by Suno AI. Licensed under MIT. suno-ai/bark — 39,100+ GitHub stars