Remotion Rule: Display Captions
Remotion skill rule: Displaying captions in Remotion with TikTok-style pages and word highlighting. Part of the official Remotion Agent Skill for programmatic video in React.
What it is
This is a Remotion skill rule that teaches AI agents how to render captions in programmatic video. It implements TikTok-style paged captions with word-level highlighting that syncs to audio timing. The rule is part of the official Remotion Agent Skill for building video in React.
The skill targets developers and content creators using Remotion to produce short-form video (TikTok, YouTube Shorts, Reels) where synchronized captions are essential for engagement and accessibility.
How it saves time or tokens
Manually timing caption highlights frame by frame is tedious. This skill rule provides a pattern for automatic word-level synchronization based on subtitle timing data. An AI agent following the rule is more likely to generate a working caption component on the first attempt, which reduces iterative debugging of timing issues.
How to use
- Add this skill rule to your Claude Code or Remotion agent configuration.
- Provide subtitle timing data (SRT or JSON format).
- The agent generates a React component that renders highlighted captions.
// Caption component following the Remotion skill rule
import React from 'react';
import { useCurrentFrame } from 'remotion';

interface Word {
  text: string;
  startFrame: number;
  endFrame: number;
}

export const Caption: React.FC<{ words: Word[] }> = ({ words }) => {
  // Current frame of the composition; decides which word is highlighted
  const frame = useCurrentFrame();

  return (
    <div style={{ position: 'absolute', bottom: 80, width: '100%', textAlign: 'center' }}>
      {words.map((word, i) => (
        <span
          key={i}
          style={{
            // Gold while the frame is inside the word's inclusive range, white otherwise
            color: frame >= word.startFrame && frame <= word.endFrame
              ? '#FFD700' : '#FFFFFF',
            fontSize: 48,
            fontWeight: 'bold',
            textShadow: '2px 2px 4px rgba(0,0,0,0.8)',
          }}
        >
          {word.text}{' '}
        </span>
      ))}
    </div>
  );
};
Example
Subtitle timing data for the Caption component:
[
  {"text": "Welcome", "startFrame": 0, "endFrame": 15},
  {"text": "to", "startFrame": 16, "endFrame": 22},
  {"text": "our", "startFrame": 23, "endFrame": 30},
  {"text": "tutorial", "startFrame": 31, "endFrame": 50}
]
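To make the timing data concrete, the following sketch applies the same inclusive frame comparison the Caption component uses, outside of React. The helper name `activeWordIndex` is hypothetical and not part of the skill rule; it only illustrates which word would be highlighted at a given frame.

```typescript
interface Word {
  text: string;
  startFrame: number;
  endFrame: number;
}

// Same data as the example above
const words: Word[] = [
  { text: 'Welcome', startFrame: 0, endFrame: 15 },
  { text: 'to', startFrame: 16, endFrame: 22 },
  { text: 'our', startFrame: 23, endFrame: 30 },
  { text: 'tutorial', startFrame: 31, endFrame: 50 },
];

// Hypothetical helper: index of the word whose inclusive frame range
// contains `frame`, or -1 when no word is active at that frame.
function activeWordIndex(list: Word[], frame: number): number {
  return list.findIndex((w) => frame >= w.startFrame && frame <= w.endFrame);
}
```

Because the ranges in the example data do not overlap, at most one word is highlighted on any frame; after frame 50 no word is active.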
Related on TokRepo
- AI tools for video — Video production and editing tools
- AI tools for content — Content creation tools and workflows
Common pitfalls
- Word timing data must be frame-accurate. If your subtitle source uses seconds, convert to frames using the video's FPS (e.g., 30fps means 1 second = 30 frames).
- Page breaks in long captions need careful handling. Show at most 4-6 words per page for readability on mobile screens.
- Font rendering differs across platforms. Test caption appearance in both the Remotion preview and the final rendered MP4.
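The page-break advice above can be sketched as a small grouping step that runs before rendering. This is a minimal illustration under stated assumptions, not the skill rule's own implementation: the `Page` type and the `paginate` and `pageAtFrame` helpers are hypothetical names, and the default of 5 words per page is simply a value inside the 4-6 range mentioned above.

```typescript
interface Word {
  text: string;
  startFrame: number;
  endFrame: number;
}

// Hypothetical page type: a run of words shown on screen together
interface Page {
  words: Word[];
  startFrame: number; // first frame of the first word
  endFrame: number;   // last frame of the last word
}

// Split the word list into consecutive pages of at most `maxWordsPerPage` words.
function paginate(words: Word[], maxWordsPerPage = 5): Page[] {
  if (maxWordsPerPage < 1) {
    throw new Error('maxWordsPerPage must be at least 1');
  }
  const pages: Page[] = [];
  for (let i = 0; i < words.length; i += maxWordsPerPage) {
    const chunk = words.slice(i, i + maxWordsPerPage);
    pages.push({
      words: chunk,
      startFrame: chunk[0].startFrame,
      endFrame: chunk[chunk.length - 1].endFrame,
    });
  }
  return pages;
}

// Find the page to display at `frame`, or undefined when no page is active.
function pageAtFrame(pages: Page[], frame: number): Page | undefined {
  return pages.find((p) => frame >= p.startFrame && frame <= p.endFrame);
}
```

A caption component could then render only the words of `pageAtFrame(pages, frame)` and apply the per-word highlight within that page. Splitting purely by word count ignores sentence boundaries and pauses, so a production version might also break pages on long gaps between words.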
Frequently Asked Questions
The caption component works in any Remotion project, regardless of the video content. It is a standard React component that uses Remotion's useCurrentFrame hook: you pass word timing data as props and the component handles the highlighting.
To get word-level timing data, use a speech-to-text service such as Whisper, AssemblyAI, or Deepgram that provides word-level timestamps. These services typically output JSON with start and end times per word, in seconds. Convert those timestamps to frames based on your video's FPS.
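The seconds-to-frames conversion can be sketched as follows. The `TimedWord` shape (`text`, `start`, `end` in seconds) is an assumption for illustration; each service names its fields differently, so map the service's output into this shape first. The `toFrames` helper is likewise hypothetical.

```typescript
// Assumed input shape: one entry per word, times in seconds
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

// Shape expected by the Caption component
interface Word {
  text: string;
  startFrame: number;
  endFrame: number;
}

// Convert second-based timestamps to inclusive frame ranges.
// The end frame is one less than the rounded end time so that a word ending
// exactly where the next one starts does not share a frame with it.
function toFrames(timed: TimedWord[], fps: number): Word[] {
  return timed.map((w) => {
    const startFrame = Math.round(w.start * fps);
    const endFrame = Math.max(startFrame, Math.round(w.end * fps) - 1);
    return { text: w.text, startFrame, endFrame };
  });
}
```

In a component, `fps` would normally come from `useVideoConfig()`, so the same timestamp data keeps working if the composition's frame rate changes.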
The caption style is customizable. The highlight color, font size, shadow, and position are all CSS properties you can modify. The skill rule provides a baseline TikTok-style look; adjust colors and fonts to match your brand or video theme.
Captions in other languages are supported. The caption component renders any text passed as word data, including Unicode and CJK characters. For right-to-left languages, set the CSS direction property to rtl on the container.
Animated highlights are possible. Remotion's interpolate and spring functions let you add scale, opacity, and position animations to each word as it highlights. The skill rule provides the basic pattern; you extend it with Remotion's animation utilities.
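As a rough illustration of a per-word animation, the sketch below computes a "pop" scale for a word as it becomes active. It is dependency-free so it can run outside a Remotion project: `clampedLerp` is a hand-written stand-in for a clamped linear mapping, and `highlightScale`, `popFrames`, and `maxScale` are hypothetical names and values. Inside a component, Remotion's `interpolate` with clamped extrapolation could take the place of the stand-in.

```typescript
interface Word {
  text: string;
  startFrame: number;
  endFrame: number;
}

// Stand-in for a clamped linear mapping from [x0, x1] to [y0, y1]
function clampedLerp(x: number, x0: number, x1: number, y0: number, y1: number): number {
  const t = Math.min(1, Math.max(0, (x - x0) / (x1 - x0)));
  return y0 * (1 - t) + y1 * t;
}

// Scale factor for a word at `frame`: 1 while inactive, growing to `maxScale`
// over the first `popFrames` frames of the word, then holding until it ends.
function highlightScale(frame: number, word: Word, popFrames = 5, maxScale = 1.2): number {
  if (frame < word.startFrame || frame > word.endFrame) {
    return 1;
  }
  return clampedLerp(frame, word.startFrame, word.startFrame + popFrames, 1, maxScale);
}
```

The result could be applied to each span as `transform: scale(...)` together with `display: 'inline-block'`, since CSS transforms do not affect inline elements.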
Citations
- Remotion GitHub: Remotion is a framework for making videos programmatically in React
- Remotion Documentation: Remotion documentation for useCurrentFrame and video composition
- OpenAI Whisper GitHub: Whisper provides word-level speech-to-text timestamps
Source & Thanks
Created by Remotion. Licensed under MIT. Source: remotion-dev/skills, rule: display-captions
Part of the Remotion AI Skill collection on TokRepo.