What is This Skill?
This skill teaches AI coding agents how to use Together AI's audio API for text-to-speech (TTS) and speech-to-text (STT). It covers REST TTS, streaming TTS via WebSocket, and speech-to-text transcription, with the correct SDK pattern for each mode.
Answer-Ready: Together AI Audio Skill for coding agents. Covers TTS (REST + WebSocket streaming), STT transcription, and realtime voice. Correct SDK patterns and model IDs. Part of official 12-skill collection.
Best for: Developers building voice-enabled AI applications. Works with: Claude Code, Cursor, Codex CLI.
What the Agent Learns
Text-to-Speech (REST)
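from together import Together

# The client reads the API key from the TOGETHER_API_KEY environment variable
client = Together()

# Model ID and voice as given here; confirm current values in the Together AI docs
response = client.audio.speech.create(
    model="together-ai/tts-model",
    input="Hello, welcome to the demo!",
    voice="alloy",
)

# Write the returned audio to disk
response.stream_to_file("output.mp3")
Streaming TTS (WebSocket)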
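Streaming TTS delivers audio in chunks rather than as one file, so the client has to reassemble them. The sketch below shows only that reassembly step. The frame shape it assumes (JSON text frames with a `type` field and base64 audio in `data`, ended by a `done` frame) and the helper name `collect_audio` are hypothetical, chosen for illustration; the real message format should be taken from the Together AI docs.

```python
import base64
import json
from typing import Iterable


def collect_audio(messages: Iterable[str]) -> bytes:
    """Concatenate base64 audio chunks carried in JSON text frames.

    Assumed (hypothetical) frame shapes:
      {"type": "audio", "data": "<base64 bytes>"}
      {"type": "done"}
    """
    audio = bytearray()
    for raw in messages:
        frame = json.loads(raw)
        if frame.get("type") == "done":
            # Stop at the end-of-stream marker; later frames are ignored
            break
        if frame.get("type") == "audio":
            audio.extend(base64.b64decode(frame["data"]))
        # Frames of any other type are skipped
    return bytes(audio)
```

With the websocket-client package, frames read from `ws.recv()` in a loop could be fed to this function through a generator, so playback or file writing can begin as soon as chunks arrive.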
# Low-latency streaming for realtime applications
import websocket
# WebSocket connection for chunked audio streaming
Speech-to-Text
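The transcription call in this section can be wrapped in a small helper that fails early when the audio file is missing. This is a minimal sketch: `transcribe_file` is a hypothetical name, and the only SDK call it relies on is `client.audio.transcriptions.create(model=..., file=...)` as used in this section. The client and model ID are passed in by the caller.

```python
from pathlib import Path


def transcribe_file(client, path, model):
    """Return the transcript text for a single audio file.

    `client` is expected to expose audio.transcriptions.create(model=..., file=...)
    and return an object with a `.text` attribute, as in this section's snippet.
    """
    audio_path = Path(path)
    if not audio_path.is_file():
        # Fail before making any API call
        raise FileNotFoundError(f"audio file not found: {audio_path}")
    with audio_path.open("rb") as f:
        transcript = client.audio.transcriptions.create(model=model, file=f)
    return transcript.text
```

Passing the client in, rather than creating it inside the helper, keeps the function easy to exercise with a stub object in tests.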
with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="together-ai/whisper",
        file=f,
    )
print(transcript.text)
FAQ
Q: What voices are available?
A: Multiple voices are supported. Check the Together AI docs for the current voice list.