Key Capabilities
- Text-to-video: Generate videos from text descriptions
- Image-to-video: Animate static images
- Variable resolution: 240p to 720p+
- Variable duration: 2s to 16s clips
- Fully trainable: Fine-tune on your own data
- Apache 2.0: Commercially usable
Architecture
Open-Sora is based on a DiT (Diffusion Transformer) with spatial-temporal attention and is trained on large-scale video datasets. It uses a VAE to encode video and a text encoder to interpret prompts.
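To give a feel for why spatial-temporal attention matters for video, here is a rough sketch that counts query-key pairs per attention layer. It assumes the attention is factorized into a spatial pass (tokens attend within their own frame) followed by a temporal pass (each spatial position attends across frames); that factorization is one common reading of the term and is not confirmed here as Open-Sora's exact design. The grid size (16 latent frames of 32x32 tokens) and the function name `attention_pairs` are hypothetical, chosen only for illustration.

```python
def attention_pairs(frames: int, height: int, width: int) -> dict:
    """Count query-key pairs per attention layer for a grid of latent tokens.

    Illustrative only: the factorized layout is an assumption, not a
    description of Open-Sora's actual attention implementation.
    """
    spatial_tokens = height * width
    total_tokens = frames * spatial_tokens

    # Full joint attention: every token attends to every other token.
    full = total_tokens ** 2

    # Spatial pass: tokens attend only to tokens in the same frame.
    spatial = frames * spatial_tokens ** 2

    # Temporal pass: each spatial position attends across all frames.
    temporal = spatial_tokens * frames ** 2

    return {
        "full": full,
        "spatial": spatial,
        "temporal": temporal,
        "factorized": spatial + temporal,
    }


# Hypothetical latent grid: 16 frames, 32 x 32 tokens per frame.
counts = attention_pairs(frames=16, height=32, width=32)
for name, value in counts.items():
    print(f"{name:>10}: {value:,} query-key pairs")
print(f"full / factorized = {counts['full'] / counts['factorized']:.1f}x")
```

Under these assumptions the factorized form needs far fewer pairwise comparisons than joint attention over all tokens, and the gap widens as clips get longer or resolution increases.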
FAQ
Q: What is Open-Sora? A: An open-source text-to-video generation framework with an 11B-parameter model, designed as an accessible alternative to OpenAI's Sora. It has 28,800+ GitHub stars and is Apache 2.0 licensed.
Q: Is Open-Sora free? A: Yes. Open-Sora is Apache 2.0 licensed. You need your own GPU hardware for inference (recommended: NVIDIA A100 or equivalent).
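As a rough sanity check on the hardware recommendation, the sketch below estimates how much memory the weights of an 11B-parameter model occupy at common numeric precisions. It covers weights only: activations, the VAE, and the text encoder add more on top, and the precision Open-Sora actually runs in is not stated here, so treat the dtype list as an assumption. The helper `weight_memory_gb` is hypothetical and not part of Open-Sora.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 11_000_000_000  # the 11B figure quoted in the FAQ

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, dtype):.0f} GB for weights alone")
```

At 16-bit precision the weights alone come to roughly 22 GB, which helps explain why a data-center GPU such as the A100 (sold in 40 GB and 80 GB variants) is the suggested starting point.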