AI Short-Video Pipeline: TikTok / Reels / Shorts
Ten picks for the creator who publishes short vertical video on TikTok, Reels, Shorts and Douyin: one-click pipelines (MoneyPrinterTurbo, OpenMontage), bridges to commercial APIs such as Sora, Runway and Kling (Together AI, Generative Media Skills), the programmatic Remotion pipeline for cinema-grade shorts, ElevenLabs voiceover, VideoCaptioner subtitles, the OpenCut editor, and FFmpeg as the multimedia backbone.
What this pack is for
Short-form video is no longer a single-tool job. It is a five-layer pipeline: pick a topic that is currently trending, write a hook-first script, generate or shoot the footage, burn in captions that read on a muted phone, then cut and publish across three to four platforms that each have their own algorithm and UI safe area.
This pack assembles ten picks that cover each layer for the TikTok / Reels / Shorts / Douyin creator — opinionated, not exhaustive — so you can move from a trending topic to a published vertical short without stitching docs from ten different repos. The split is deliberate: a one-click path for the daily 3-shorts-a-day grind (MoneyPrinterTurbo, OpenMontage), a commercial-API bridge for hero shots where you need Sora / Veo / Runway / Kling quality, and a programmatic path (Remotion) for the moments when you actually care about typography, motion design and brand consistency.
Install in this order
1. Trending discovery + orchestration
- OpenMontage (#855) — AI Video Production System that ingests a topic and orchestrates the rest of the pipeline. Treat it as the conductor when you don't want to glue MoneyPrinterTurbo + Remotion + FFmpeg by hand.
- Video AI Toolkit (#107) — curated reference collection. Read it once to know what each piece in the ecosystem actually does before you commit to a stack.
2. Script — hook in the first 3 seconds
- MoneyPrinterTurbo (#108) — topic in, finished 9:16 short out. The script step runs on OpenAI / Gemini / DeepSeek and produces a hook-first voiceover script in under a minute. Use it as the script generator even if you replace the rest of the pipeline.
- Together AI Video Generation Skill (#777) — commercial-API bridge. The same skill that calls Sora / Veo / Runway will happily call an LLM to draft a script in the same workflow. Pay per call, no infra.
3. Footage — AI-generated, stock, or programmatic
- Remotion AI Video Production Skill (#1150) — cinema-grade short videos in React. When the look matters and you want exact typography, motion design and brand consistency, this is the path. Renders to mp4 with FFmpeg under the hood.
- Generative Media Skills (#3602) — muapi plus an "npx skills add" installer that unifies Sora / Runway / Kling / Pika and a dozen other commercial generation APIs behind one CLI. Use this when an agent needs to call "generate a 5-second clip" without picking a vendor every time.
4. Captions — readable on a muted phone
- VideoCaptioner (#110) — end-to-end AI subtitle pipeline. Transcribes, segments, styles, and burns word-level captions. Built for vertical 9:16 output where the safe-area is narrow.
- Remotion AI Voiceover Skill — ElevenLabs TTS (#102) — generate the voiceover with ElevenLabs and sync it to your Remotion comp. Pair with the captions skill so the burned-in text actually matches the audio.
5. Edit + publish
- OpenCut (#4027) — open-source AI video editor. Trim, splice, color-match generated clips. Avoids the export round-trip to a closed NLE when an agent needs to do the final cut.
- FFmpeg (#1157) — the universal multimedia backbone. Every pick above shells out to FFmpeg eventually. Worth installing as a first-class CLI so you can re-encode for each platform's preferred codec / bitrate without re-rendering.
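The per-platform re-encode in step 5 can be sketched as a small command builder. This is a minimal sketch, not a recommended encoding ladder: the bitrates in `PLATFORM_PRESETS` and the output naming scheme are hypothetical placeholders, so check each platform's current upload guidelines before relying on them. The FFmpeg flags themselves (`-vf scale/pad`, `-c:v libx264`, `-b:v`, `-movflags +faststart`) are standard.

```python
import shlex

# Hypothetical per-platform targets; the numbers are illustrative only.
PLATFORM_PRESETS = {
    "tiktok": {"video_bitrate": "8M", "audio_bitrate": "128k"},
    "reels": {"video_bitrate": "6M", "audio_bitrate": "128k"},
    "shorts": {"video_bitrate": "10M", "audio_bitrate": "192k"},
    "douyin": {"video_bitrate": "8M", "audio_bitrate": "128k"},
}


def build_ffmpeg_command(master: str, platform: str) -> list[str]:
    """Return an ffmpeg argv that re-encodes `master` for one platform."""
    preset = PLATFORM_PRESETS[platform]
    # Hypothetical naming scheme: master.mp4 -> master_reels.mp4
    output = f"{master.rsplit('.', 1)[0]}_{platform}.mp4"
    return [
        "ffmpeg", "-y",
        "-i", master,
        # Fit inside 1080x1920, then pad to exactly 9:16.
        "-vf",
        "scale=1080:1920:force_original_aspect_ratio=decrease,"
        "pad=1080:1920:(ow-iw)/2:(oh-ih)/2",
        "-c:v", "libx264", "-b:v", preset["video_bitrate"],
        "-pix_fmt", "yuv420p",
        "-c:a", "aac", "-b:a", preset["audio_bitrate"],
        # Move the index to the front so playback can start early.
        "-movflags", "+faststart",
        output,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed.
    for name in PLATFORM_PRESETS:
        print(shlex.join(build_ffmpeg_command("master.mp4", name)))
```

Building an argv list rather than a shell string keeps file names with spaces safe when the list is later passed to `subprocess.run`.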
How they fit together
Trending topic
      │
      ▼
OpenMontage ──► MoneyPrinterTurbo (script + assemble)
      │                  │
      │          ┌───────┴───────┐
      │          ▼               ▼
      │     Together AI    Generative Media Skills
      │     (Sora/Veo)     (Sora/Runway/Kling/Pika)
      │          │               │
      │          └───────┬───────┘
      │                  ▼
      │           Raw 9:16 clips
      │                  │
      │          ┌───────┴───────┐
      │          ▼               ▼
      │      Remotion       ElevenLabs VO
      │   (typography +     (Remotion AI
      │   motion design)    Voiceover Skill)
      │          │               │
      │          └───────┬───────┘
      │                  ▼
      │           VideoCaptioner
      │        (word-level captions)
      │                  │
      │                  ▼
      │               OpenCut
      │             (final cut)
      │                  │
      │                  ▼
      └──────────────► FFmpeg
                (re-encode per platform)
                         │
         ┌─────────┬─────┴─────┬──────────┐
         ▼         ▼           ▼          ▼
      TikTok     Reels      Shorts     Douyin
The two paths that matter: one-click (MoneyPrinterTurbo or OpenMontage straight to OpenCut) for the daily volume, and programmatic (Remotion + ElevenLabs + VideoCaptioner) for the brand pieces where typography and timing have to be exact. Commercial APIs slot into both paths for hero shots.
Tradeoffs you'll hit
- AI-generated footage vs real footage — AI generation is fast and infinite, but algorithms on TikTok and Reels are increasingly tuned to detect and downrank pure AI content. Mix: AI for B-roll, real footage for A-roll, real voice (or a near-real ElevenLabs voice) over the top.
- One-click vs programmatic — MoneyPrinterTurbo gets you a publishable short in 5 minutes; Remotion takes 5 hours the first time and 30 minutes thereafter. Use one-click for the daily grind, Remotion for series pieces where the look has to stay consistent across 50 episodes.
- One master vs platform-specific cuts — a single 9:16 master ships to all four platforms in minutes; platform-specific cuts (CapCut-style hook on TikTok, retention curve recut on Shorts, watermark-free for Reels) materially boost reach. Start with one master, A/B against a recut once you have data.
- Local generation vs commercial API — local models (Open-Sora, CogVideo) are free per second but eat GPU hours and tuning. Commercial APIs (Sora / Runway / Kling via Together AI or Generative Media Skills) are priced per second and ship the same day. Run local for iteration, API for the hero shots that have to land.
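The one-click vs programmatic tradeoff is easier to judge with the arithmetic written out. The sketch below uses only the figures quoted above (5 minutes per one-click short, 5 hours for the first Remotion episode, 30 minutes for each later one); treat them as rough estimates, not measurements.

```python
ONE_CLICK_MIN = 5            # minutes per short on the one-click path
REMOTION_FIRST_MIN = 5 * 60  # first episode, template built from scratch
REMOTION_NEXT_MIN = 30       # each later episode reusing the template


def one_click_total(episodes: int) -> int:
    """Total minutes to produce `episodes` shorts on the one-click path."""
    return ONE_CLICK_MIN * max(episodes, 0)


def remotion_total(episodes: int) -> int:
    """Total minutes on the Remotion path, including the first-time setup."""
    if episodes <= 0:
        return 0
    return REMOTION_FIRST_MIN + REMOTION_NEXT_MIN * (episodes - 1)


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(n, one_click_total(n), remotion_total(n))
```

For a 50-episode series this comes to 250 minutes one-click against 1,770 minutes with Remotion. On these numbers Remotion never wins on time; the extra hours buy the consistent look, which is why it is reserved for series pieces.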
Common pitfalls
- Weak hook → 3-second drop-off — TikTok and Reels measure watch-through in the first 3 seconds. "Hi guys, today I want to talk about" is a guaranteed drop-off. Open with a question, a number, or a contradiction. MoneyPrinterTurbo's default script prompt is a starting point, not a finished hook.
- Caption colour and position — white text on a bright background is unreadable; bottom-third captions get covered by TikTok's UI chrome. Place captions in the middle third with a contrasting stroke or background plate. VideoCaptioner has presets for each platform's safe area.
- Visible TikTok watermarks cost reach elsewhere — Reels makes recycled, watermarked content less discoverable and Shorts down-ranks it. Always export the master without platform watermarks; OpenCut's clean export is safer than re-uploading a TikTok download.
- Publish time not researched — the same short posted at 7am vs 9pm local time can differ 5-10× in initial impressions. Each platform has different peaks; read the in-app analytics before you settle on a posting schedule.
- AI-content detection / reach throttling — TikTok, YouTube Shorts and Douyin all run detection passes that down-rank pure-AI uploads. Add real-camera footage, a real-voice intro, or screen-recorded UI to keep the trust signal up; fully synthetic videos cap out fast.
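The caption-position advice above (middle third, clear of the bottom UI chrome) reduces to a little pixel arithmetic. This is a minimal sketch assuming a 1080x1920 frame and a hypothetical 120-pixel caption block; it is not how VideoCaptioner's presets are implemented, and real safe areas differ per platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptionBox:
    """Vertical band of the frame, in pixels from the top edge."""
    top: int
    bottom: int


def middle_third(frame_height: int = 1920) -> CaptionBox:
    """Return the middle third of the frame as a vertical band."""
    third = frame_height // 3
    return CaptionBox(top=third, bottom=2 * third)


def caption_top(frame_height: int = 1920, text_height: int = 120) -> int:
    """Top edge that centers a caption block inside the middle third."""
    box = middle_third(frame_height)
    return box.top + (box.bottom - box.top - text_height) // 2


if __name__ == "__main__":
    print(middle_third())  # band from 640 to 1280 on a 1920-pixel frame
    print(caption_top())   # 900
```

The returned offset can be fed to whatever renders the captions, for example as the y position of a text layer.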
Frequently asked questions
Will fully AI-generated short videos get throttled by TikTok / Reels / Shorts?
Increasingly yes. All three platforms have shipped AI-detection signals into ranking in the last 12 months. Pure synthetic uploads can publish but typically cap on reach. The practical workaround is hybrid: keep AI generation for B-roll and stock-replacement shots, but mix in real camera footage, a real-voice opening, or screen-recorded UI for the first 3 seconds. The hook in particular benefits from sounding human even if the body is synthesized.
Which captions tool is most accurate for vertical short video?
For English and major European languages, Whisper (the model VideoCaptioner wraps) is currently the accuracy benchmark and runs locally for free. For Chinese, Japanese and Korean, VideoCaptioner ships dedicated pipelines that outperform raw Whisper because they handle segment length and word-level timing better for tighter safe areas on vertical screens. Either way, plan to hand-correct numbers, brand names and proper nouns — no automatic captioner gets those right at 100%.
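The hand-correction pass can be narrowed by flagging the caption segments most likely to be wrong. The sketch below is a crude heuristic of our own, not a feature of Whisper or VideoCaptioner: it flags tokens containing digits, and capitalized words that are not the first word of the segment, as a rough proxy for numbers, brand names and proper nouns.

```python
import re

HAS_DIGIT = re.compile(r"\d")
CAPITALIZED = re.compile(r"^[A-Z][\w'-]*$")
PUNCTUATION = ".,!?;:\"()"


def needs_review(segment: str) -> list[str]:
    """Return the tokens in a caption segment worth a human look."""
    flagged = []
    for index, raw in enumerate(segment.split()):
        word = raw.strip(PUNCTUATION)
        if not word or word == "I":
            continue
        is_number = bool(HAS_DIGIT.search(word))
        # The first word is capitalized anyway, so it is not a signal.
        is_name = index > 0 and bool(CAPITALIZED.match(word))
        if is_number or is_name:
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    print(needs_review("I tried 12 tools and FFmpeg won."))
```

The heuristic misses names that open a segment and only works for scripts with letter case, so it shortens the review list rather than replacing the review.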
Sora vs Runway vs Kling — which commercial API should I default to?
Default to whichever one accepts your script under its content policy. Sora is strongest on natural-language prompts and physical coherence; Runway is strongest on stylized motion and existing-image-to-video; Kling tends to be strongest on human and dance motion. The Generative Media Skills (#3602) installer lets you call all three behind one CLI, so you can prompt the same scene through each and pick the winner. That is the practical workflow, rather than picking one upfront.
One master cut for all four platforms, or recut per platform?
Start with one 9:16 master and ship to all four platforms — that is the path to volume. Once you have data on any single short that materially out-performs (say 3× your average on Reels but flat on TikTok), recut that specific short for the winning platform: tighter hook, different cover frame, captions repositioned for that platform's UI chrome. Per-platform recuts are an optimization tax on winners, not a pre-publish step on every upload.
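The recut rule above can be written as a simple filter. This is a sketch under stated assumptions: the 3× threshold is the example figure from the answer, and the dictionaries of view counts and per-platform averages are hypothetical inputs you would fill in from each platform's analytics.

```python
def recut_candidates(
    views: dict[str, int],
    averages: dict[str, float],
    factor: float = 3.0,
) -> list[str]:
    """Return platforms where one short beat your average by `factor`."""
    winners = []
    for platform, count in views.items():
        baseline = averages.get(platform, 0)
        # Skip platforms with no history; there is nothing to compare to.
        if baseline > 0 and count >= factor * baseline:
            winners.append(platform)
    return winners


if __name__ == "__main__":
    views = {"reels": 30_000, "tiktok": 9_000, "shorts": 4_000}
    averages = {"reels": 10_000, "tiktok": 10_000, "shorts": 5_000}
    print(recut_candidates(views, averages))  # ['reels']
```

Only the platforms returned here justify a recut; everything else keeps shipping from the single master.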
How do I write a viral hook in the first 3 seconds?
Three patterns that survive across platforms: (1) state a contradiction — "You're using FFmpeg wrong, here's why"; (2) lead with a number that demands resolution — "I tried 12 AI video tools, 9 were unusable"; (3) ask a question whose answer is the rest of the video — "Why does this 200-line script outperform $500/mo SaaS?". Avoid hellos, intros, or anything that sounds like a podcast intro. MoneyPrinterTurbo and Together AI script prompts both accept hook templates — feed them one of these three patterns rather than letting the model default to greetings.
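A generated script's first line can be checked against these patterns before it goes to voiceover. The sketch below is a rough lint, not part of MoneyPrinterTurbo or Together AI: the list of greeting openers is a hypothetical starting set, and contradictions are left unclassified because they need human judgment.

```python
import re

# Hypothetical list of openers to reject; extend it for your own niche.
GREETING_OPENERS = ("hi ", "hey ", "hello ", "welcome ", "today i ")


def classify_hook(line: str) -> str:
    """Label a hook as greeting, question, number, or unclassified."""
    text = line.strip()
    if text.lower().startswith(GREETING_OPENERS):
        return "greeting"
    if text.endswith("?"):
        return "question"
    if re.search(r"\d", text):
        return "number"
    # Contradiction hooks land here; a person has to judge them.
    return "unclassified"


if __name__ == "__main__":
    hooks = [
        "Hi guys, today I want to talk about",
        "I tried 12 AI video tools, 9 were unusable",
        "Why does this 200-line script outperform $500/mo SaaS?",
        "You're using FFmpeg wrong, here's why",
    ]
    for hook in hooks:
        print(classify_hook(hook), "|", hook)
```

A "greeting" result is the one to act on: regenerate the script with one of the three patterns rather than editing around the greeting.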