Skills · Apr 8, 2026 · 1 min read

Together AI Audio TTS/STT Skill for Claude Code

A skill that teaches Claude Code how to use Together AI's audio API. It covers text-to-speech (REST and WebSocket streaming), speech-to-text transcription, and realtime voice interaction.

Prompt Lab · Community
Quick Use

Use it first, then decide how deep to go

Run the following command to add the togethercomputer/skills collection, which includes this audio skill:

npx skills add togethercomputer/skills

What is This Skill?

This skill teaches AI coding agents how to use Together AI's audio API for text-to-speech and speech-to-text. It covers REST TTS, streaming TTS over WebSocket, and speech-to-text transcription, with the SDK pattern for each mode.

Answer-Ready: The Together AI Audio Skill for coding agents covers TTS (REST and WebSocket streaming), STT transcription, and realtime voice, with SDK patterns and model IDs. It is part of the official 12-skill collection.

Best for: Developers building voice-enabled AI applications. Works with: Claude Code, Cursor, Codex CLI.

What the Agent Learns

Text-to-Speech (REST)

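from together import Together

# Together() reads the API key from the TOGETHER_API_KEY environment variable.
client = Together()

# Request synthesized speech over REST.
response = client.audio.speech.create(
    model="together-ai/tts-model",  # example ID; confirm against Together AI's current model list
    input="Hello, welcome to the demo!",
    voice="alloy",  # example voice; check the docs for the current voice list
)

# Write the returned audio to disk.
response.stream_to_file("output.mp3")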
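The REST call above can be wrapped in a small helper so an application validates its input and creates the output directory before requesting audio. This is only a sketch: `synthesize_to_file` is a hypothetical name, not part of the Together SDK, and it assumes the client exposes `audio.speech.create(...)` and that the response exposes `stream_to_file(...)`, exactly as in the snippet above. The model and voice IDs are passed through unchanged, so the caller is responsible for supplying valid ones.

```python
from pathlib import Path


def synthesize_to_file(client, text, out_path, model, voice):
    """Request speech for `text` and save it to `out_path`.

    Hypothetical helper: `client` is assumed to behave like the Together()
    client shown above (audio.speech.create + response.stream_to_file).
    """
    # Reject empty input early instead of spending an API call on it.
    if not text.strip():
        raise ValueError("text must not be empty")

    # Make sure the destination directory exists before writing.
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)

    response = client.audio.speech.create(model=model, input=text, voice=voice)
    response.stream_to_file(str(out_path))
    return out_path
```

Taking the client as a parameter keeps the helper easy to exercise with a stub object, so the file handling can be checked without network access or an API key.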

Streaming TTS (WebSocket)

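# Low-latency streaming for realtime applications.
# The `websocket` module comes from the websocket-client package (pip install websocket-client).
import websocket

# The WebSocket connection delivers the audio in chunks as it is synthesized.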
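The snippet above stops before the audio is consumed, so here is a transport-agnostic sketch of the receiving side. It assumes the chunks have already been decoded to raw `bytes`; the actual endpoint URL and message format of Together AI's WebSocket API are not given here and should be taken from the official docs. `write_audio_chunks` is a hypothetical helper name.

```python
def write_audio_chunks(chunks, out_path):
    """Write an iterable of audio byte chunks to `out_path`.

    Hypothetical helper: assumes each item is raw audio bytes. Returns the
    total number of bytes written.
    """
    total = 0
    with open(out_path, "wb") as f:
        for chunk in chunks:
            # Skip empty keep-alive or end-of-stream frames.
            if not chunk:
                continue
            f.write(chunk)
            total += len(chunk)
    return total
```

In a realtime application the same loop would typically hand each chunk to an audio player instead of a file, so playback can start before synthesis finishes.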

Speech-to-Text

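from together import Together

client = Together()

# Open the audio file in binary mode and send it for transcription.
with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="together-ai/whisper",  # example ID; confirm against Together AI's current model list
        file=f,
    )
print(transcript.text)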
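The transcription call above can be wrapped the same way, so a missing file fails before any request is made. This is a sketch: `transcribe_file` is a hypothetical name, and it assumes the client exposes `audio.transcriptions.create(...)` and that the result has a `.text` attribute, as in the snippet above.

```python
from pathlib import Path


def transcribe_file(client, audio_path, model):
    """Return the transcript text for the audio file at `audio_path`.

    Hypothetical helper: `client` is assumed to behave like the Together()
    client shown above.
    """
    audio_path = Path(audio_path)
    # Fail fast with a clear error instead of a failed upload.
    if not audio_path.is_file():
        raise FileNotFoundError(audio_path)

    # The context manager keeps the file open for the duration of the call
    # and closes it afterwards.
    with audio_path.open("rb") as f:
        transcript = client.audio.transcriptions.create(model=model, file=f)
    return transcript.text
```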

FAQ

Q: What voices are available? A: Multiple voices are supported. Check the Together AI docs for the current voice list.

Source & Thanks

Part of togethercomputer/skills — MIT licensed.
