Rule Content
Transcribing audio
To transcribe audio to generate captions in Remotion, you can use the transcribe() function from the @remotion/install-whisper-cpp package.
Prerequisites
First, the @remotion/install-whisper-cpp package needs to be installed. If it is not installed, use the following command:
npx remotion add @remotion/install-whisper-cppTranscribing
Make a Node.js script to download Whisper.cpp and a model, and transcribe the audio.
import path from "path";
import {
downloadWhisperModel,
installWhisperCpp,
transcribe,
toCaptions,
} from "@remotion/install-whisper-cpp";
import fs from "fs";
const to = path.join(process.cwd(), "whisper.cpp");
await installWhisperCpp({
to,
version: "1.5.5",
});
await downloadWhisperModel({
model: "medium.en",
folder: to,
});
// Convert the audio to a 16KHz wav file first if needed:
// import {execSync} from 'child_process';
// execSync('ffmpeg -i /path/to/audio.mp4 -ar 16000 /path/to/audio.wav -y');
const whisperCppOutput = await transcribe({
model: "medium.en",
whisperPath: to,
whisperCppVersion: "1.5.5",
inputPath: "/path/to/audio123.wav",
tokenLevelTimestamps: true,
});
// Optional: Apply our recommended postprocessing
const { captions } = toCaptions({
whisperCppOutput,
});
// Write it to the public/ folder so it can be fetched from Remotion
fs.writeFileSync("captions123.json", JSON.stringify(captions, null, 2));Transcribe each clip individually and create multiple JSON files.
See Displaying captions for how to display the captions in Remotion.