Skills · Mar 29, 2026 · 1 min read

Remotion AI Voiceover Skill — ElevenLabs TTS

AI skill for adding ElevenLabs text-to-speech voiceover to Remotion videos. Auto-sizes composition duration to match generated audio.

TokRepo Picks · Community
Quick Use

Use it first, then decide how deep to go

npx skills add remotion-dev/skills
# Set your ElevenLabs API key
export ELEVENLABS_API_KEY=your_key

Intro

A Remotion skill for AI-generated voiceover using ElevenLabs TTS. Generate speech audio per scene, then use calculateMetadata to dynamically size the video composition to match. Perfect for automated video pipelines where narration needs to be generated programmatically. Part of the Remotion AI Skills collection.

Best for: Automated video narration, explainer videos, podcast visualizations

Works with: Claude Code, OpenAI Codex, Cursor


How It Works

  1. Define your script — Text for each scene in a config file
  2. Generate audio — Script calls ElevenLabs API, writes MP3s to public/
  3. Dynamic duration — calculateMetadata reads audio duration, sizes composition accordingly
  4. Render — Remotion renders video with synced voiceover
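Step 1 can be sketched as a small typed config. This is only an illustration: the `Scene` shape, the `scenes` array, and the `voiceoverPath` helper are hypothetical names, not part of the skill. The file naming follows the `public/voiceover-scene-1.mp3` pattern used in the generation script.

```typescript
// Hypothetical scene config; field names are assumptions for illustration.
type Scene = { id: string; text: string };

const scenes: Scene[] = [
  { id: "scene-1", text: "Welcome to the demo." },
  { id: "scene-2", text: "Here is how it works." },
];

// Where the generated MP3 for a scene would be written, relative to the project root.
function voiceoverPath(scene: Scene): string {
  return `public/voiceover-${scene.id}.mp3`;
}

// The same file as it would be passed to staticFile(), which resolves relative to public/.
function voiceoverStaticName(scene: Scene): string {
  return `voiceover-${scene.id}.mp3`;
}
```

Keeping the text and the file name derived from one `id` means the generation script and the composition cannot drift apart on naming.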

Generating Audio with ElevenLabs

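// generate-voiceover.ts
import { writeFile } from "node:fs/promises";

const voiceId = "your_voice_id"; // placeholder: replace with an ElevenLabs voice ID
const sceneText = "Welcome to the demo."; // placeholder: narration text for this scene

const response = await fetch(`https://api.elevenlabs.io/v1/text-to-speech/${voiceId}`, {
  method: "POST",
  headers: {
    "xi-api-key": process.env.ELEVENLABS_API_KEY ?? "",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    text: sceneText,
    model_id: "eleven_multilingual_v2",
  }),
});
if (!response.ok) {
  throw new Error(`ElevenLabs request failed: ${response.status}`);
}
// Write audio to public/voiceover-scene-1.mp3
await writeFile("public/voiceover-scene-1.mp3", Buffer.from(await response.arrayBuffer()));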

Run: node --strip-types generate-voiceover.ts
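For more than one scene, the request construction can be pulled into a pure helper so it is easy to inspect before any API call is made. The sketch below is an assumption about how one might structure it: `Scene`, `TtsJob`, and `buildTtsJob` are hypothetical names. The endpoint, headers, and `model_id` are the ones used in the script above.

```typescript
// Hypothetical shapes for illustration; not part of the skill itself.
type Scene = { id: string; text: string };

type TtsJob = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
  outputPath: string;
};

// Builds the request for one scene without sending it.
function buildTtsJob(scene: Scene, voiceId: string, apiKey: string): TtsJob {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    init: {
      method: "POST",
      headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify({ text: scene.text, model_id: "eleven_multilingual_v2" }),
    },
    // One MP3 per scene, written under public/ so Remotion can load it.
    outputPath: `public/voiceover-${scene.id}.mp3`,
  };
}
```

A generation loop would then call `fetch(job.url, job.init)` for each job and write the response bytes to `job.outputPath`.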

Dynamic Composition Duration

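import { staticFile } from "remotion";
import { getAudioDurationInSeconds } from "@remotion/media-utils";

const FPS = 30; // must match the fps configured on the composition
export const calculateMetadata = async () => {
  // The file name is relative to public/; use the name your generation script wrote.
  const duration = await getAudioDurationInSeconds(staticFile("voiceover.mp3"));
  return { durationInFrames: Math.ceil(duration * FPS) }; // round up so the audio is not cut off
};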
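When there is one audio file per scene, the same arithmetic extends to a list of durations. The following is a minimal sketch, assuming the per-scene durations in seconds have already been measured. The helper names `framesPerScene`, `sceneStartFrames`, and `totalFrames` are hypothetical.

```typescript
// Frames needed for each scene, rounding up so no audio is cut off.
function framesPerScene(durationsInSeconds: number[], fps: number): number[] {
  return durationsInSeconds.map((seconds) => Math.ceil(seconds * fps));
}

// Start frame of each scene when scenes play back to back.
function sceneStartFrames(frames: number[]): number[] {
  const starts: number[] = [];
  let cursor = 0;
  for (const count of frames) {
    starts.push(cursor);
    cursor += count;
  }
  return starts;
}

// Total composition length in frames.
function totalFrames(frames: number[]): number {
  return frames.reduce((sum, count) => sum + count, 0);
}

const frames = framesPerScene([2.5, 1.25, 4], 30); // [75, 38, 120]
const starts = sceneStartFrames(frames);           // [0, 75, 113]
const total = totalFrames(frames);                 // 233
```

The total would be returned as `durationInFrames`, and each start frame could serve as the `from` value of a `<Sequence>` wrapping that scene.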

FAQ

Q: What TTS service does the Remotion voiceover skill use?
A: ElevenLabs by default, but any TTS service that produces audio files can be substituted.

Q: Does the video duration auto-adjust to the voiceover?
A: Yes. The skill uses Remotion's calculateMetadata to dynamically set composition duration based on the generated audio length.


Source & Thanks

Created by Remotion. Licensed under MIT. remotion-dev/skills — Voiceover rule
