Remotion Rule: Voiceover
Remotion skill rule: Adding AI-generated voiceover to Remotion compositions using TTS. Part of the official Remotion Agent Skill for programmatic video in React.
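This asset is staged safely
The asset is staged first. The copied instruction asks the agent to read the staged files and to confirm before activating any scripts, MCP configuration, or global configuration.
npx -y tokrepo@latest install 47ec82a6-f5e7-4530-94cd-3de1508ccb43 --target codex
This command stages the files first; read the staged README and install plan before activating.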
What it is
The Remotion Voiceover Rule is a skill rule from the official Remotion Agent Skill set. It provides a structured pattern for adding AI-generated speech audio to Remotion video compositions using text-to-speech (TTS) services, with ElevenLabs as the default provider. The rule activates automatically when working with voiceover in a Remotion project.
This rule is intended for developers building programmatic video with React via Remotion who want to add narration or voiceover without manual audio recording.
How it saves time or tokens
Without this rule, developers must figure out the TTS integration, audio file management, and composition duration matching from scratch. The rule provides a ready-made pattern: generate MP3 files per scene, use calculateMetadata to dynamically size the composition to match audio length, and wire everything together. This eliminates trial-and-error and keeps the AI coding agent on the correct path, saving both developer time and token usage during AI-assisted development.
How to use
- Install the Remotion skills package:
npx skills add remotion-dev/skills
- Set your ElevenLabs API key as an environment variable:
export ELEVENLABS_API_KEY=your_key_here
- Create a voiceover generation script that reads your scene config and calls the ElevenLabs API for each scene, writing MP3 files to the public/ directory. Then run it:
node --strip-types generate-voiceover.ts
- Use calculateMetadata in your Remotion composition to read the audio duration and set the composition length dynamically.
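Step 2 assumes the key is present when the generation script runs. A minimal guard, sketched here with a hypothetical `requireEnv` helper (it is not part of the rule, Remotion, or the ElevenLabs SDK), makes a missing key fail immediately with a readable message instead of surfacing later as an authentication error.

```typescript
// Hypothetical helper, not part of Remotion or the ElevenLabs SDK.
// Returns the value of an environment variable, or throws if it is unset or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(
      `Missing environment variable ${name}. Set it before running the script.`,
    );
  }
  return value;
}

// Possible usage at the top of the generation script:
// const apiKey = requireEnv("ELEVENLABS_API_KEY");
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real process environment.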
Example
A minimal voiceover generation script structure:
import { ElevenLabsClient } from 'elevenlabs';
import fs from 'fs';

// The API key is read from the environment (see "How to use").
const client = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY,
});

async function generateVoiceover(text: string, outputPath: string) {
  const audio = await client.generate({
    text,
    voice: 'Rachel',
    model_id: 'eleven_multilingual_v2',
  });
  // Assumption: generate() resolves to a stream of audio chunks rather than
  // a Response, so the chunks are collected before the file is written.
  const chunks: Buffer[] = [];
  for await (const chunk of audio) {
    chunks.push(Buffer.from(chunk));
  }
  fs.writeFileSync(outputPath, Buffer.concat(chunks));
}

// Generate one MP3 per scene
const scenes = [
  { text: 'Welcome to our product demo.', file: 'public/voice-1.mp3' },
  { text: 'Here is the key feature.', file: 'public/voice-2.mp3' },
];

for (const scene of scenes) {
  await generateVoiceover(scene.text, scene.file);
}
Then in your Remotion composition, use calculateMetadata to read audio duration and set durationInFrames accordingly.
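The duration step can be sketched as a pure function. This sketch assumes the audio length of each scene has already been measured in seconds (for example inside `calculateMetadata`, using whatever duration helper your Remotion version provides). `layoutScenes` and `SceneTiming` are hypothetical names, not Remotion APIs. Each scene is rounded up to whole frames so that no audio is cut off.

```typescript
// Hypothetical shape: where a scene starts and how long it lasts, in frames.
type SceneTiming = { from: number; durationInFrames: number };

// Converts per-scene audio lengths (seconds) into frame offsets and a total length.
function layoutScenes(
  audioDurationsInSeconds: number[],
  fps: number,
): { scenes: SceneTiming[]; durationInFrames: number } {
  let cursor = 0;
  const scenes = audioDurationsInSeconds.map((seconds) => {
    // Round up so the final fraction of a frame of audio still plays.
    const durationInFrames = Math.max(1, Math.ceil(seconds * fps));
    const timing = { from: cursor, durationInFrames };
    cursor += durationInFrames;
    return timing;
  });
  // A composition needs at least one frame, even with no scenes.
  return { scenes, durationInFrames: Math.max(1, cursor) };
}
```

The returned `durationInFrames` is the value to hand back from `calculateMetadata`, and each `from` / `durationInFrames` pair can position one scene's audio inside a `<Sequence>`.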
Related on TokRepo
- Automation tools — Explore automation tools for video and content workflows
- Video tools — Browse video creation and editing tools
Common pitfalls
- Forgetting to set the ELEVENLABS_API_KEY environment variable before running the generation script. The script will fail silently or throw an auth error.
- Not using calculateMetadata to dynamically size the composition. Hardcoding durationInFrames leads to audio being cut off or having long silence at the end.
- Generating all voiceover audio on every render. Cache the MP3 files and only regenerate when the script text changes.
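One way to approach the caching pitfall is to record a hash of each scene's inputs and regenerate only when the hash changes or the file is missing. This is a sketch under assumptions: the manifest format, `hashScene`, and `scenesToRegenerate` are hypothetical and not part of the rule.

```typescript
import { createHash } from "node:crypto";
import { existsSync } from "node:fs";

type CachedScene = { text: string; voice: string; file: string };
// Hypothetical manifest: output path -> hash of the inputs that produced it.
type VoiceoverManifest = Record<string, string>;

// Hashes the inputs that affect the audio, so any text or voice change is detected.
function hashScene(scene: CachedScene): string {
  return createHash("sha256")
    .update(JSON.stringify([scene.voice, scene.text]))
    .digest("hex");
}

// Returns the scenes whose MP3 is missing or was generated from different inputs.
function scenesToRegenerate(
  scenes: CachedScene[],
  manifest: VoiceoverManifest,
  fileExists: (path: string) => boolean = existsSync,
): CachedScene[] {
  return scenes.filter(
    (scene) =>
      !fileExists(scene.file) || manifest[scene.file] !== hashScene(scene),
  );
}
```

After generating a scene, the script would store `hashScene(scene)` under `scene.file` in the manifest and persist it, for example as a JSON file next to the audio.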
FAQ
Can I use a TTS provider other than ElevenLabs?
The rule defaults to ElevenLabs but explicitly states that any TTS service producing an audio file can be substituted. You need to swap out the API call in the generation script while keeping the same file output pattern.
How does the composition length stay in sync with the audio?
The rule uses Remotion's calculateMetadata function to read the generated audio file, extract its duration, and set the composition's durationInFrames dynamically. This ensures video length matches the voiceover exactly.
Can I use a different voice for each scene?
Yes. The generation script processes scenes independently, so you can specify a different voice ID or provider per scene. Just update the voice parameter in each generation call.
Does this work with Remotion Lambda?
Yes, but you must generate the audio files before the render starts and include them in the bundle. Remotion Lambda bundles the public/ directory, so pre-generated MP3 files are available during cloud rendering.
How do I install the rule?
Run npx skills add remotion-dev/skills in your project directory. The rule activates automatically when the AI agent detects voiceover-related work in a Remotion project. No manual configuration is needed beyond the TTS API key.
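The provider swap and per-scene voices mentioned in the answers above can be sketched by separating the TTS call from the file-writing loop. Everything named here (`TtsProvider`, `VoicedScene`, `generateScenes`) is a hypothetical illustration, not an API of the rule, Remotion, or ElevenLabs; the default voice name is taken from the example script.

```typescript
// Hypothetical provider contract: any TTS service that returns audio bytes fits.
type TtsProvider = (text: string, voice: string) => Promise<Uint8Array>;
type VoicedScene = { text: string; file: string; voice?: string };

// Generates audio for each scene in order and hands the bytes to writeFile.
// A scene without its own voice falls back to defaultVoice.
async function generateScenes(
  scenes: VoicedScene[],
  provider: TtsProvider,
  writeFile: (path: string, data: Uint8Array) => void,
  defaultVoice = "Rachel",
): Promise<void> {
  for (const scene of scenes) {
    const audio = await provider(scene.text, scene.voice ?? defaultVoice);
    writeFile(scene.file, audio);
  }
}
```

Switching providers then means passing a different `provider` function, while the output paths under public/ stay the same. In a real script, `fs.writeFileSync` can be passed as `writeFile`.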
Related assets
Remotion Rule: Maps
Remotion skill rule: Make map animations with Mapbox. Part of the official Remotion Agent Skill for programmatic video in React.
Remotion Rule: Parameters
Remotion skill rule: Make a video parametrizable by adding a Zod schema. Part of the official Remotion Agent Skill for programmatic video in React.
Remotion Rule: Transparent Videos
Remotion skill rule: Rendering transparent videos in Remotion. Part of the official Remotion Agent Skill for programmatic video in React.
Remotion Rule: Text Animations
Remotion skill rule: Typography and text animation patterns for Remotion. Part of the official Remotion Agent Skill for programmatic video in React.